Google BigQuery as Target
Fivetran HVR supports integrating changes into a Google BigQuery location. This section describes the configuration requirements for integrating changes into a BigQuery location using Integrate and Refresh. For the list of supported BigQuery versions into which HVR can integrate changes, see Integrate changes into location in Capabilities.
The preferred methods for writing data into BigQuery are Burst Integrate and Bulk Refresh, as they provide better performance. However, both methods require staging files to be created in a temporary location. For more information about staging, see section Staging for BigQuery.
Continuous Integrate is not recommended for replication to BigQuery. BigQuery is optimized for batch processing, so the default method, Burst Integrate, is highly efficient, while applying changes one by one (Continuous) is significantly less efficient.
We strongly advise against using Continuous Integrate: besides being inefficient, it can cause replication issues. Use Burst Integrate for optimal performance and reliability when replicating to BigQuery.
Multi-statement Transactions
By default, HVR applies changes in BigQuery using auto-commit. The limitation of auto-commit is that HVR cannot properly recover if Integrate exits during an integrate cycle, which can result in duplicate rows in the target. To overcome this limitation, you can use multi-statement transactions instead.
To enable multi-statement transactions, you must define the following environment variable:
| Action | Parameters |
|---|---|
| Environment | Name=HVR_BIGQUERY_ENABLE_SESSIONS Value=1 |
When multi-statement transactions are enabled, replication fails if the channel contains more than 99 tables, since BigQuery does not support more than 100 tables in the same transaction.
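The table limit above can be checked before enabling multi-statement transactions. The following is a minimal sketch, not part of HVR: the function name `check_channel_table_count` and the way the table list is obtained are assumptions made for illustration, and only the limit of 99 tables comes from the text above.

```python
# Hypothetical pre-flight check (not an HVR utility).
# The limit of 99 tables per channel is taken from the documentation text.
MAX_TABLES_WITH_SESSIONS = 99


def check_channel_table_count(table_names, sessions_enabled):
    """Return (ok, message) for a channel's table list.

    table_names: iterable of table names in the channel (assumed input).
    sessions_enabled: True if HVR_BIGQUERY_ENABLE_SESSIONS is set to 1.
    """
    count = len(set(table_names))  # ignore accidental duplicates
    if sessions_enabled and count > MAX_TABLES_WITH_SESSIONS:
        return (
            False,
            f"{count} tables exceed the limit of {MAX_TABLES_WITH_SESSIONS} "
            "for multi-statement transactions",
        )
    return True, f"{count} tables: within limits"


if __name__ == "__main__":
    tables = [f"table_{i}" for i in range(120)]
    ok, message = check_channel_table_count(tables, sessions_enabled=True)
    print(ok, message)
```

If the check fails, one option is to split the tables across several channels so that each stays within the limit.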
Grants for Integrate and Refresh
This section lists the grants/permissions required for integrating changes into Google BigQuery.
The HVR database user must be granted the following three roles:
These three roles grant the following permissions: storage.buckets.get, storage.objects.create, storage.objects.delete, storage.objects.get, storage.objects.list, bigquery.jobs.create, bigquery.datasets.get, bigquery.routines.get, bigquery.routines.list, bigquery.tables.create, bigquery.tables.delete, bigquery.tables.get, bigquery.tables.getData, bigquery.tables.list, bigquery.tables.updateData.
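The permission list above can serve as a checklist when verifying an account. The sketch below is illustrative only: `missing_permissions` is a hypothetical helper, and it assumes you have already collected the account's granted permissions by whatever IAM inspection method you use. Only the permission names come from this section.

```python
# Permissions listed in this section as required for Integrate and Refresh.
REQUIRED_PERMISSIONS = frozenset({
    "storage.buckets.get",
    "storage.objects.create",
    "storage.objects.delete",
    "storage.objects.get",
    "storage.objects.list",
    "bigquery.jobs.create",
    "bigquery.datasets.get",
    "bigquery.routines.get",
    "bigquery.routines.list",
    "bigquery.tables.create",
    "bigquery.tables.delete",
    "bigquery.tables.get",
    "bigquery.tables.getData",
    "bigquery.tables.list",
    "bigquery.tables.updateData",
})


def missing_permissions(granted):
    """Return the required permissions absent from `granted`, sorted by name.

    granted: iterable of permission strings held by the account
             (assumed to be gathered separately; this helper calls no API).
    """
    return sorted(REQUIRED_PERMISSIONS - set(granted))


if __name__ == "__main__":
    # Example input: an account that holds only two of the permissions.
    granted = ["bigquery.jobs.create", "bigquery.tables.get"]
    for permission in missing_permissions(granted):
        print("missing:", permission)
```

An empty result means the account holds every permission named in this section. It does not confirm which roles were used to grant them.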