Google BigQuery as Target

Fivetran HVR supports integrating changes into a Google BigQuery location. This section describes the configuration requirements for integrating changes into a BigQuery location using Integrate and Refresh. For the list of supported BigQuery versions into which HVR can integrate changes, see Integrate changes into location in Capabilities.

To integrate changes into a BigQuery location using Integrate and Refresh, you must use staging files. For more information about staging files, see the Staging for BigQuery section.

The preferred methods for writing data into BigQuery are Burst Integrate and Bulk Refresh, as they provide better performance.

Row-wise Refresh is not supported for replication to BigQuery.
Continuous Integrate is not recommended for replication to BigQuery. BigQuery is optimized for batch processing, so applying changes one by one (Continuous) is significantly less efficient than the default Burst Integrate method, and it may also cause replication issues. We strongly recommend using Burst Integrate for optimal performance and reliability when replicating to BigQuery.

Grants for Integrate and Refresh

This section lists the grants/permissions required for integrating changes into Google BigQuery.

The HVR database user must be granted the following three roles, which include the required permissions:
- BigQuery Data Editor
  Provides the following required permissions: bigquery.datasets.get, bigquery.routines.get, bigquery.routines.list, bigquery.tables.create, bigquery.tables.delete, bigquery.tables.get, bigquery.tables.getData, bigquery.tables.list, bigquery.tables.updateData
- BigQuery Job User
  Provides the following required permission: bigquery.jobs.create
- Storage Admin
  Provides the following required permissions: storage.buckets.get, storage.objects.create, storage.objects.delete, storage.objects.get, storage.objects.list
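As an illustration only, the three roles above could be granted with the gcloud CLI. The sketch below builds the corresponding command strings without running them. It assumes that the HVR database user is a Google Cloud service account and that the roles are granted at project level; the project ID and service account email are hypothetical placeholders, and the helper name grant_commands is not part of HVR.

```python
# Standard Google Cloud role IDs for the three roles listed above.
REQUIRED_ROLES = [
    "roles/bigquery.dataEditor",  # BigQuery Data Editor
    "roles/bigquery.jobUser",     # BigQuery Job User
    "roles/storage.admin",        # Storage Admin
]


def grant_commands(project_id, service_account_email):
    """Return one gcloud command string per required role.

    Assumption: the HVR user is a service account and the roles are
    bound at project level. The commands are only built, not executed.
    """
    member = f"serviceAccount:{service_account_email}"
    return [
        f"gcloud projects add-iam-policy-binding {project_id} "
        f"--member={member} --role={role}"
        for role in REQUIRED_ROLES
    ]


if __name__ == "__main__":
    # Hypothetical project and account names, for illustration only.
    for cmd in grant_commands(
        "my-project", "hvr-user@my-project.iam.gserviceaccount.com"
    ):
        print(cmd)
```

Review the printed commands before running them, since your organization may prefer to bind roles at dataset or bucket level instead.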

Multi-statement Transactions

By default, HVR applies changes in BigQuery using auto-commit. The limitation of auto-commit is that HVR cannot properly recover if Integrate exits during an integrate cycle, which can result in duplicate rows in the target. To overcome this limitation, you can use multi-statement transactions instead.

To enable multi-statement transactions, you must define the following environment variable:

Action	Parameters
Environment	Name=HVR_BIGQUERY_ENABLE_SESSIONS Value=1

When multi-statement transactions are enabled, replication fails if the channel contains more than 99 tables, because BigQuery does not support more than 100 tables in the same transaction.
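The table limit can be checked before enabling the environment variable. The following is a minimal sketch, assuming you already have the list of table names in the channel; the function name check_channel_tables and the way the table list is obtained are hypothetical and not part of HVR.

```python
# Limit stated above: more than 99 tables fails with multi-statement transactions.
MAX_TABLES_WITH_SESSIONS = 99


def check_channel_tables(table_names, sessions_enabled):
    """Return the number of distinct tables, or raise if the limit is exceeded.

    The limit only applies when HVR_BIGQUERY_ENABLE_SESSIONS=1 is defined.
    """
    count = len(set(table_names))
    if sessions_enabled and count > MAX_TABLES_WITH_SESSIONS:
        raise ValueError(
            f"channel has {count} tables; at most "
            f"{MAX_TABLES_WITH_SESSIONS} are allowed with "
            "multi-statement transactions"
        )
    return count


if __name__ == "__main__":
    tables = [f"table_{i}" for i in range(120)]  # hypothetical channel
    try:
        check_channel_tables(tables, sessions_enabled=True)
    except ValueError as err:
        print(err)
```

If the check fails, one option is to split the tables across several channels so that each stays within the limit.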

Intermediate Directory

This option in the HVR UI allows you to specify a directory path for storing intermediate (temporary) files generated during Compare. These files are created during both "direct file compare" and "online compare" operations.

This option is displayed in the location creation dialog when creating a new location, and in the Source and Target Properties pane on the Location Details page when editing an existing location.

Using an intermediate directory can enhance performance by ensuring that temporary files are stored in a location optimized for the system's data processing needs.

This setting is particularly relevant for target file locations, as it determines where the intermediate files are placed during the Compare operation. If this option is not enabled, the intermediate files are stored by default in the integratedir/_hvr_intermediate directory, where integratedir is the replication DIRECTORY (File_Path) defined for the target file location.
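The default described above can be expressed as a small path rule. This is a sketch of that rule only, assuming POSIX-style paths; the function name resolve_intermediate_dir is hypothetical and does not correspond to an HVR API.

```python
import posixpath


def resolve_intermediate_dir(integratedir, intermediate_directory=None):
    """Return the directory where intermediate Compare files would be placed.

    If Intermediate_Directory is set, it is used as-is. Otherwise the
    default is the _hvr_intermediate subdirectory of integratedir.
    """
    if intermediate_directory:
        return intermediate_directory
    return posixpath.join(integratedir, "_hvr_intermediate")


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    print(resolve_intermediate_dir("/data/target"))
    print(resolve_intermediate_dir("/data/target", "/fast_disk/hvr_tmp"))
```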

This option is equivalent to the location property Intermediate_Directory.

Intermediate Directory is Local

This option indicates that the Intermediate Directory will be created on the local drive of the location's server. This option is displayed when you select the Intermediate Directory option in the HVR UI. It is selected by default and cannot be modified.

Storing intermediate files locally is crucial for optimizing performance by reducing network latency and avoiding potential permission issues associated with remote storage. It enables HVR to process data more efficiently by leveraging the speed and reliability of local storage. This is particularly beneficial when the HVR Agent has access to ample local storage, allowing it to handle large data volumes without relying on networked storage solutions.

This option is equivalent to the location property Intermediate_Directory_Is_Local.

Known Issue

Timestamp values with fractional seconds in years before 1698 or after 2242 are reported as mismatches (Out-of-Sync) by HVR Compare. This happens due to a rounding error during data querying. For example, a timestamp value such as 1697-12-31 23:59:59.123 results in a mismatch.
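To anticipate which values fall under this known issue, the condition can be written as a simple predicate. This is a minimal sketch based only on the year range and fractional-seconds condition stated above; the function name may_report_mismatch is hypothetical, and Python's datetime resolves fractions only to microseconds.

```python
from datetime import datetime


def may_report_mismatch(ts):
    """Return True if ts matches the known-issue condition.

    The condition is: the value has fractional seconds AND its year is
    before 1698 or after 2242.
    """
    has_fraction = ts.microsecond != 0
    outside_range = ts.year < 1698 or ts.year > 2242
    return has_fraction and outside_range


if __name__ == "__main__":
    # The example value from the text: 1697-12-31 23:59:59.123
    example = datetime(1697, 12, 31, 23, 59, 59, 123000)
    print(may_report_mismatch(example))
```

Running such a check over source timestamp columns can help separate expected Out-of-Sync reports from genuine differences.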