Error: Parquet File Was Inaccessible
Issue
Queries on Snowflake Iceberg tables fail with errors similar to the following:
The parquet file '<location>/<file>.parquet' for table '<table>' was inaccessible.
Environment
- Destination: Managed Data Lake Service
- Query engine: Snowflake
Resolution
To resolve this issue:
- Ensure the tables are created with `AUTO_REFRESH = TRUE`.
- Ensure Snowflake refreshes the tables immediately after our daily table maintenance operation. Use the following commands in Snowflake to verify the refresh interval and table refresh status:

    SHOW CATALOG INTEGRATIONS; //Look for the `REFRESH_INTERVAL_SECONDS` parameter
    SELECT SYSTEM$AUTO_REFRESH_STATUS('<table_name>');

- If necessary, reduce `REFRESH_INTERVAL_SECONDS` to a lower value, such as 60 seconds.

For more information, see "Maintain tables that use an external catalog" in the Snowflake documentation.
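The steps above might be applied with statements along the following lines. This is a minimal sketch, not a definitive procedure: `my_catalog_int` and `my_db.my_schema.my_table` are hypothetical placeholder names, and it assumes your role has the privileges to alter the catalog integration and the Iceberg table. Check each statement against the Snowflake documentation for your account before running it.

```sql
-- Hypothetical names: replace my_catalog_int and my_db.my_schema.my_table
-- with your own catalog integration and Iceberg table.

-- Shorten the interval at which Snowflake polls the external catalog
-- for metadata changes (60 seconds, as suggested above).
ALTER CATALOG INTEGRATION my_catalog_int SET REFRESH_INTERVAL_SECONDS = 60;

-- Enable automated refresh on a table that was created without it.
ALTER ICEBERG TABLE my_db.my_schema.my_table SET AUTO_REFRESH = TRUE;

-- Confirm that automated refresh is active for the table.
SELECT SYSTEM$AUTO_REFRESH_STATUS('my_db.my_schema.my_table');

-- Optional one-off manual refresh if queries are already failing.
ALTER ICEBERG TABLE my_db.my_schema.my_table REFRESH;
```

A manual refresh only clears the immediate error. The automated refresh settings are what keep the table metadata in step with each daily maintenance run.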
Cause
This issue occurs when our daily expire_snapshot maintenance removes old snapshots and unreferenced data files during the first sync of the day, but the Snowflake Iceberg table is not refreshed soon afterward. When this happens, queries can attempt to access Parquet files that have already been deleted.