Initial Sync Taking Longer Than Expected
Issue
A Google Cloud Storage connection's initial sync is taking longer than expected.
Environment
Connector: Google Cloud Storage
Resolution
To improve sync performance:
- Consolidate small files into larger files when possible.
- Organize files into logical folder structures.
- Reduce the amount of historical data included in the initial sync.
- Monitor sync progress from the Fivetran dashboard.
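As one way to act on the first recommendation, the following is a minimal sketch of merging many small CSV files into fewer, larger ones before they are uploaded to the bucket. It assumes the files are available locally and share an identical header row. The function name consolidate_csv_files, the rows_per_file parameter, and the part-00000.csv naming scheme are hypothetical choices for illustration and are not part of any Fivetran or Google Cloud tooling.

```python
import csv
from pathlib import Path


def consolidate_csv_files(source_paths, out_dir, rows_per_file=100_000):
    """Merge small CSV files that share a header into fewer, larger files.

    Returns the list of output paths that were written.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    outputs = []
    header = None
    out_handle = None
    writer = None
    rows_in_current = 0

    try:
        for path in source_paths:
            with open(path, newline="") as src:
                reader = csv.reader(src)
                file_header = next(reader, None)
                if file_header is None:
                    continue  # skip completely empty files
                if header is None:
                    header = file_header
                elif file_header != header:
                    raise ValueError(f"Header mismatch in {path}")

                for row in reader:
                    # Open a new output file when none is open yet
                    # or the current one has reached its row limit.
                    if writer is None or rows_in_current >= rows_per_file:
                        if out_handle is not None:
                            out_handle.close()
                        out_path = out_dir / f"part-{len(outputs):05d}.csv"
                        out_handle = open(out_path, "w", newline="")
                        writer = csv.writer(out_handle)
                        writer.writerow(header)
                        outputs.append(out_path)
                        rows_in_current = 0
                    writer.writerow(row)
                    rows_in_current += 1
    finally:
        if out_handle is not None:
            out_handle.close()

    return outputs
```

A suitable value for rows_per_file depends on your data and is not something this sketch can determine. The point is only that the connector then sees fewer files, each carrying more records.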
Cause
Large historical syncs can take longer because of:
- High data volume
- A large number of source files
- Many small files with few records
- Source throughput limitations
- Destination throughput limitations
For example, syncing millions of small files with only hundreds or thousands of records per file can significantly increase sync duration because each file adds processing overhead. Larger, consolidated files generally improve performance.
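To make the effect of per-file overhead concrete, here is a rough back-of-the-envelope model. It is an illustrative simplification, not a description of how sync time is actually computed: it assumes a fixed overhead per file plus a constant record throughput. The function name and all of the numbers below are hypothetical.

```python
def estimate_sync_seconds(file_count, records_per_file,
                          per_file_overhead_ms, records_per_second):
    """Rough estimate: fixed per-file overhead plus record processing time."""
    overhead_seconds = file_count * per_file_overhead_ms / 1000
    processing_seconds = file_count * records_per_file / records_per_second
    return overhead_seconds + processing_seconds


# Same one billion records, split two different ways (made-up figures).
many_small_files = estimate_sync_seconds(
    file_count=2_000_000, records_per_file=500,
    per_file_overhead_ms=50, records_per_second=50_000,
)
few_large_files = estimate_sync_seconds(
    file_count=2_000, records_per_file=500_000,
    per_file_overhead_ms=50, records_per_second=50_000,
)

print(f"Many small files: {many_small_files:,.0f} s")
print(f"Few large files:  {few_large_files:,.0f} s")
```

Under these assumed figures, the per-file overhead accounts for most of the estimated time in the many-small-files case and becomes negligible once the same records are consolidated into fewer files. Real overhead and throughput vary by source, destination, and file format.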
The following warning is expected during long-running syncs and does not indicate a failure:
Infrastructure rescheduled sync for appropriate resource provisioning
We automatically resume syncing from the last checkpoint after allocating additional resources.
Rows may not appear in the destination immediately during the early stages of an initial sync. We may spend considerable time processing and preparing source data before the load phase begins.