BigQuery Setup Guide
Follow our setup guide to connect your BigQuery data warehouse to Fivetran.
NOTE: This is a BigQuery setup guide. For instructions on Fivetran-managed BigQuery, see our Managed BigQuery documentation.
Prerequisites
To connect BigQuery to Fivetran, you need the following:
- A BigQuery account or a Google Apps account
- A Fivetran role with the Create Destinations or Manage Destinations permissions
NOTE: Fivetran doesn't support BigQuery sandbox accounts.
Setup instructions
Find Project ID
You need to grant Fivetran access to your BigQuery cluster so we can create and manage tables for your data, and periodically load data into those tables.
Go to your Google Cloud Console's projects list.
Find your project on the list and make a note of the project ID in the ID column. You will need it to configure Fivetran.
Find Fivetran service account
Log in to your Fivetran account.
Go to the Destinations page, and then click Add destination.
In the pop-up window, enter a Destination name of your choice.
Click Add.
Select BigQuery as the destination type.
Enter the Project ID you found in Step 1.
Make a note of the Fivetran service account. You will need to grant it permissions in BigQuery.
Create service account
IMPORTANT: This is a mandatory step for the Hybrid Deployment model. For SaaS Deployment model, this is an optional step and you must perform it only if you want to use your own service account, instead of the Fivetran-managed service account, to control access to BigQuery.
Create a service account by following the instructions in Google Cloud's documentation.
Create a private key for your service account by following the instructions in Google Cloud's documentation. The private key must be in the JSON format. Make a note of all the contents of the JSON file. You will need it to configure Fivetran.
Example of private key:
{ "type": "service_account", "project_id": "random-project-12345", "private_key_id": "abcdefg", "private_key": "*****", "client_email": "name@project.iam.gserviceaccount.com", "client_id": "12345678", "auth_uri": "https://accounts.google.com/o/oauth2/auth", "token_uri": "https://oauth2.googleapis.com/token", "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs", "client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/name%40project.iam.gserviceaccount.com" }
Configure service account
Go back to the IAM & admin tab, and go to the project principals list.
Select + GRANT ACCESS.
In the New Principals field, enter the Fivetran service account you found in Step 2 or the service account you created in Step 3. The service account is the entire email address.
Click Select a role > BigQuery > BigQuery User.
NOTE: For more information about roles, see our documentation.
VPC service perimeter configuration
You must set up a GCP bucket to ingest data from Fivetran in either of the following scenarios:
- You want to use the Hybrid Deployment model
- You use a service perimeter to control access to BigQuery
NOTE: The bucket must be present in the same location as the dataset location.
Assign permissions to service account
You must give the service account (in the setup form) Storage Object Admin permission for the bucket, so that it can read and write the data from the bucket.
In your Google Cloud Console, go Storage > Browser to see the list of buckets in your current project.
Select the bucket you want to use.
Go to Permissions and then click Add Principals.
In the Add principals window, enter the Fivetran service account you found in Step 2 or the service account you created in Step 3.
From the Select a role drop-down, select Storage Object Admin.
Set the life cycle of objects in the bucket
You must set a lifecycle rule so that data older than one day is deleted from your bucket.
In your Google Cloud Console, go Storage > Browser to see the list of buckets in your current project.
In the list, find the bucket you are using for Fivetran, and in the Lifecycle rules column, select its rules.
Click ADD A RULE. A detail view will open.
In the Select an action section, select Delete object.
Click CONTINUE.
In the Select object conditions section, select the Age checkbox and then enter 1.
Click CONTINUE and then click CREATE.
Complete Fivetran configuration
Log in to your Fivetran account.
Go to the Destinations page and click Add destination.
Enter a Destination name of your choice and then click Add.
Select BigQuery as the destination type.
In the destination setup form, enter the Project ID you found in Step 1.
(Optional for Enterprise and Business Critical accounts) If you want to use the Hybrid Deployment model, do the following:
i. Set the Enable hybrid deployment toggle to ON.
ii. Select an existing Hybrid Deployment Agent from the Agent name list or configure a new agent.
NOTE: To select an existing agent, go to the More Options menu for the agent you want to use and select Make agent default.
iii. In the Service Account Private Key field, paste all the contents of the private key JSON file you created.
iv. Enter the Customer Bucket name you configured.
(Optional for SaaS Deployment and not applicable to Hybrid Deployment) If you want to use your service account, do the following:
i. Set the Use own Service Account toggle to ON.
ii. In the Service Account Private Key field, paste all the contents of the private key JSON file that you created in Step 3.
Enter the Data Location.
(Optional for SaaS Deployment and not applicable to Hybrid Deployment) If you want to use your GCS bucket to process the data instead of a Fivetran-managed bucket, set the Use GCP Service Parameter toggle to ON, and then enter the Customer Bucket name you configured in Step 5.
NOTE: The Use GCP Service Parameter toggle does not appear for the Hybrid Deployment model.
Choose the Data processing location. Depending on the plan you are on and your selected cloud service provider, you may also need to choose a Cloud service provider and cloud region as described in our Destinations documentation.
NOTE: To use a Private Google Access connection, choose GCP as the Cloud service provider.
Choose your Time zone.
(Optional for Business Critical accounts and not applicable to Hybrid Deployment) To enable regional failover, set the Use Failover toggle to ON, and then select your Failover Location and Failover Region. Make a note of the IP addresses of the secondary region and safelist these addresses in your firewall.
Click Save & Test.
Fivetran tests and validates the BigQuery connection. On successful completion of the setup tests, you can sync your data using Fivetran connectors to the BigQuery destination.
In addition, Fivetran automatically configures a Fivetran Platform Connector to transfer the connector logs and account metadata to a schema in this destination. The Fivetran Platform Connector enables you to monitor your connectors, track your usage, and audit changes. The connector sends all these details at the destination level.
IMPORTANT: If you are an Account Administrator, you can manually add the Fivetran Platform Connector on an account level so that it syncs all the metadata and logs for all the destinations in your account to a single destination. If an account-level Fivetran Platform Connector is already configured in a destination in your Fivetran account, then we don't add destination-level Fivetran Platform Connectors to the new destinations you create.
Setup tests
Fivetran performs the following BigQuery connection tests:
The Connection test checks if we can connect to your BigQuery data warehouse and retrieve a list of the datasets.
The Check Permissions test validates if we have the required permissions on your data warehouse. The test also checks if billing has been enabled on your account and is not a sandbox account. As part of the test we:
- create a dataset to check if we have
bigquery.datasets.create
permissions on your data warehouse. - create a table in the dataset (
bigquery.tables.create
permissions) and insert a row in the table (bigquery.tables.updateData
permissions) to check if billing has been enabled. - create a job to check if we have
bigquery.jobs.create
permissions.
- create a dataset to check if we have
The Bucket Configuration Test verifies if we have the Storage Object Admin permission on your data bucket if you are using your own data bucket to process the data. The test also checks if the bucket is located in the same dataset. We skip this test if you are using a Fivetran-managed bucket.
NOTE: The tests may take a couple of minutes to finish running.
Related articles
description Destination Overview
settings API Destination Configuration