BigQuery Partitioned Tableslink
Google supports partitioned tables in BigQuery. There are two types of table partitioning:
- Tables partitioned by ingestion time: Tables partitioned based on the data's ingestion (load) date or arrival date
- Partitioned tables: Tables that are partitioned based on a
Partitioning your tables can decrease the costs of writing to and querying from your BigQuery warehouse. Fivetran still performs full table scans if your partition key is not based on your source's primary key so that we avoid writing duplicates to different partitions within the same table. If your partition key is based on your source's primary key, we automatically optimize how we write to BigQuery.
Fivetran supports both tables partitioned by ingestion time and partitioned tables. Read more about partitioning tables in Google's BigQuery documentation.
Tables partitioned by ingestion timelink
To use tables partitioned by ingestion time, you must perform these setup steps before the sync begins:
- Go to the BigQuery web UI in the GCP console.
- In the navigation bar on the left of the console, select your project.
- Find the dataset that Fivetran will use as your destination. If you don’t have a dataset you’ve selected for this purpose, create one now. (See Google's documentation for how to create a dataset.)
Note: When you create the connector that will use this dataset, you must name your destination schema (which also becomes the connector name) with the same name as the dataset.
bigquery.dataOwneraccess for this dataset to the service account so that it can create, delete, and update tables in the dataset. (See Google BigQuery documentation on controlling access to datasets for detailed instructions.)
- In the navigation bar just above the detail view of your selected dataset, select + Create Table.
- In the Create table modal, retain the default settings in the Source, Project name, Dataset name, Table type, and Schema sections.
- Under Table name, enter the same name as the corresponding table in the Fivetran ERD for your connector. (Find your connector's ERD in the Fivetran application docs.) If your connector doesn't have an ERD, you must determine the table name in your source, and use that name for your BigQuery destination table.
- Under Partition and cluster settings, select Partition by ingestion time.
- Select Create table at the bottom of the modal.
For more details, including other ways to set up your table, see the Google BigQuery documentation on creating and using ingestion-time partitioned tables.
Tables partitioned based on a TIMESTAMP, or DATE, or INTEGERlink
You can convert your non-partitioned tables to partitioned tables. Follow the step-by-step instructions to convert a BigQuery non-partitioned table into a partitioned table.
Pause the connectorlink
Go to your Fivetran dashboard and select the connector for which you want to convert your non-partitioned tables into partitioned tables.
If your connector is enabled, it will show ENABLED status in the top right corner. If your connector is already paused, skip to Step 2.
Click the top right button to pause your connector. After you click the button, the status should change to PAUSED.
Convert to partitioned tableslink
To convert an existing table to a partitioned table, run the following SQL statements in the BigQuery console:
- If the column type is of TIMESTAMP type then use:
create table [schema-name].copy partition by date([timestamp-column]) as select * from [schema-name].[table-name]; drop table [schema-name].[table-name];
- If the column type is of DATE type then use:
create table [schema-name].copy partition by [date-column] as select * from [schema-name].[table-name]; drop table [schema-name].[table-name];
- If the column type is of INTEGER type then use:
create table [schema-name].copy partition by range_bucket([integer-column],GENERATE_ARRAY([start],[end],[interval])) as select * from [schema-name].[table-name]; drop table [schema-name].[table-name];
Go to the BigQuery web UI in the GCP console.
In the navigation bar on the left of the console, select your project.
Select your dataset. In the dataset, find the table you want to copy and select that table.
In the navigation bar just above the detail view of your selected table, select Copy Table.
Enter your original table name and click Copy.
Run the following SQL in the BigQuery Query Editor:
drop table [schema-name].copy;
Repeat the steps in this section (Convert to partitioned tables) for all the tables you want to convert from non-partitioned table to partitioned tables.
Re-enable the connectorlink
Go to your Fivetran Dashboard and enable the connector again.