Snowflake

Snowflake is a cloud-based data warehousing platform that allows organizations to store and analyze large volumes of data in a scalable and cost-effective manner. The Fivetran Snowflake connector allows you to sync tables and views from one Snowflake database to your destination.

Fivetran supports Snowflake as both a database connector and a destination.

We can sync any number of schemas, tables, and views from within your Snowflake database. Fivetran supports regular and materialized views.

If you want to sync multiple Snowflake databases, we recommend that you create a connection for each database.

Known limitations

We do not support multi-factor authentication (MFA) for the Snowflake connector. If your Snowflake account requires MFA, use key pair authentication.

Features

Feature Name	Supported	Notes
Capture deletes		All tables and fields
History mode
Custom data		All tables and fields
Data blocking		Column level, table level, and schema level
Column hashing
Re-sync		Table level
API configurable		API configuration
Priority-first sync
Fivetran data models
Private networking		AWS PrivateLink Azure Private Link Google Cloud Private Service Connect
Authorization via API

Supported deployment models

We support the SaaS and Hybrid deployment models for the connector.

You must have an Enterprise or Business Critical plan to use the Hybrid Deployment model.

Setup guide

Follow our step-by-step Snowflake setup guide to connect your Snowflake database with Fivetran.

Sync overview

Once Fivetran is connected to your database, we pull a full dump of all selected data from your database. We then use either Fivetran Teleport or Time Travel-based sync to pull all your new and changed data at regular intervals. If data in your source database changes (for example, you add new tables or change a data type), Fivetran automatically detects and writes these changes into your destination.

Snowflake includes built-in system databases that store important metadata and account configuration details. These databases are part of the system by default and do not contain your actual business data. To optimize performance and storage, we do not sync this data.

Schema information

Fivetran tries to replicate the exact schema, tables, and views from your Snowflake source database to your destination.

We name the schema using the Destination schema prefix name you provide in the connection setup form. We replicate the source's tables and views to the destination schema. We name the destination schema with the connection name appended by the dataset name. For example, if the connection name is snowflake_db and the dataset name is fivetran_schema, the destination schema name is snowflake_db_fivetran_schema.

Type transformations and mapping

As we extract your data, we match the Snowflake data types to types that Fivetran supports. If we don't support a data type, we automatically change that type to the closest supported type or, in some cases, don't load that data at all. Our system automatically skips columns with data types that we don't accept or transform.

In Time Travel Sync, we support all primary keys whose data types are supported in regular columns. However, for Fivetran Teleport Sync, if we don't support the data type of the primary key, we treat the table as a primary key-less table.

The following table illustrates how we transform your Snowflake data types into Fivetran-supported types for regular and primary key columns:

Data Type	Fivetran Type	Regular Column Support	Primary Key Support (Teleport)
NUMBER* (s = 0)	One of the following types: - SHORT (p <= 4) - INTEGER (p <= 9) - LONG (p <= 18)	Yes	Yes
NUMBER* (s > 0 OR p > 18)	BIGDECIMAL	Yes	No
FLOAT	BIGDECIMAL	Yes	No
TEXT (with ASCII only)	STRING	Yes	Yes (Up to a length of 256 bytes)
TEXT (with Unicode)	STRING	Yes	Yes (Up to a length of 84 characters)
TEXT	STRING	Yes	Yes (Up to a length of 256 bytes)
BOOLEAN	BOOLEAN	Yes	No
TIME	STRING	Yes	No
TIMESTAMP_NTZ	LOCALDATETIME	Yes	No
TIMESTAMP_TZ	INSTANT	Yes	No
TIMESTAMP_LTZ	INSTANT	Yes	No
DATE	LOCALDATE	Yes	No
BINARY	BINARY	Yes	Yes (Length must be a multiple of 4 up to a max of 128 bytes and values must not contain leading zeros)
VARIANT	STRING	Yes	No
ARRAY	STRING	Yes	No
OBJECT	JSON	Yes	No
GEOGRAPHY	JSON	Yes	No
GEOMETRY	JSON	Yes	No

The NUMBER data type has a precision p and a scale s. The maximum supported values are p = 38 and s = 37. If we detect a precision or scale that exceeds those maximums, we map that data to the STRING data type and update the destination column type to STRING.

Date handling: If we encounter invalid dates (such as February 29 on non-leap years), we handle them gracefully by setting those values to NULL in the destination rather than failing the sync.

Date handling:

If the timestamp has 16 or more digits, we take the first 16 digits and interpret the value as epoch microseconds, converting it into a proper Timestamp with microsecond precision.
If the timestamp has 14–15 digits, we truncate it to the first 13 digits and treat it as epoch milliseconds.
If the timestamp has 13 or fewer digits, we directly interpret it as epoch milliseconds.
If parsing still fails after these conversions, we gracefully set those values to NULL in the destination rather than failing the sync.

Updating data

We perform incremental updates of any new or modified data from your source database to capture only the data that has changed since our last sync. Depending on the permissions you have on the database, we use one of the following incremental sync methods:

Time Travel Sync
Fivetran Teleport Sync

By default, we use Fivetran Teleport Sync to for incremental updates. However, we recommend that you use Time Travel Sync as your incremental sync method.

Time Travel Sync

Time Travel Sync uses the Change Data Capture (CDC) process to sync only the data that has changed since our last sync.

Time Travel Sync performs the following operations on your database:

Captures only the changes in the data as they occur
Leverage Snowflake Time Travel to access historical data

Fivetran Teleport Sync

Fivetran Teleport Sync queries perform the following operations on your database:

Perform a full table scan of each synced table
Perform calculations on all values in each synced table's rows
Aggregate a compressed table snapshot in the database's memory

For optimum Fivetran Teleport Sync performance, we recommend that you make the following resources available in your database:

1 GB Free RAM
1 Free CPU Core
IOPS (Teleport Sync times decrease linearly with an increase of available IOPS).

Teleport captures deletes differently than log-based replication mechanisms. Learn more in our Changes to data between syncs documentation.

Limitations

If you update a table during a sync, Fivetran Teleport Sync will not be able to detect these updates during the ongoing sync. In such cases, your connection will re-sync the entire table to capture the changes. To avoid this, we recommend increasing the sync frequency of the connection. To learn more about Fivetran Teleport Sync limitation, see our Teleport sync general limitations.

Automatic table re-syncs

For tables with a primary key, we support ADD COLUMN DDL operations with null or default static values. All other table or column alterations trigger an automatic table re-sync. For tables without a primary key, any DDL operation triggers an automatic table re-sync.

Tables with a primary key

We merge changes to tables with primary keys into the corresponding tables in your destination:

An INSERT in the source table generates a new row in the destination with _fivetran_deleted = FALSE.
A DELETE in the source table updates the corresponding row in the destination with _fivetran_deleted = TRUE.
An UPDATE in the source table updates the data in the corresponding row in the destination.

If any primary key column has an unsupported data type, then the table is automatically treated as primary key-less (which means create table permissions are required on the database or the workspace database).

Tables without a primary key

For tables without a primary key, we designate the _fivetran_id column as the primary key. We populate this column with the hash of the values of each row.

We handle changes to tables without a primary key differently:

An INSERT in the source table generates a new row in the destination with _fivetran_deleted = FALSE.
The _fivetran_id column helps us handle DELETE operations:
- If there is a row in the destination that has a corresponding _fivetran_id value, that row will be updated with _fivetran_deleted = TRUE.
- If there is not a row in the destination that has a corresponding _fivetran_id value, then no changes are done in the destination.
An UPDATE in the source table is treated as a DELETE followed by an INSERT, so it results in two rows in the destination:
- A row containing the old values with _fivetran_deleted = TRUE
- A row containing the new values with _fivetran_deleted = FALSE

As a result, one record in your source database may have several corresponding rows in your destination. For example, suppose you have a products table in your source database with no primary key:

description	quantity
Shrink-ray gun	1
Boogie robot	2
Cookie robot	3

You load this table into your destination during your initial sync, creating this destination table:

description	quantity	_fivetran_synced	_fivetran_index	_fivetran_deleted	_fivetran_id
Shrink-ray gun	1	'2000-01-01 00:00:00'	0	FALSE	asdf
Cookie robot	2	'2000-01-01 00:00:00'	1	FALSE	dfdf
Boogie robot	3	'2000-01-01 00:00:00'	2	FALSE	ewra

You then update a row:

UPDATE products SET quantity = 4 WHERE description = 'Cookie robot';

After your UPDATE operation, your destination table will look like this:

description	quantity	_fivetran_synced	_fivetran_index	_fivetran_deleted	_fivetran_id
Shrink-ray gun	1	'2000-01-01 00:00:00'	0	FALSE	asdf
Cookie robot	2	'2000-01-01 00:00:00'	3	TRUE	dfdf
Boogie robot	3	'2000-01-01 00:00:00'	2	FALSE	ewra
Cookie robot	4	'2000-01-01 00:00:00'	4	FALSE	zxfd

You then delete a row:

DELETE FROM products WHERE description = 'Boogie robot';

After your DELETE operation, your destination table will look like this:

description	quantity	_fivetran_synced	_fivetran_index	_fivetran_deleted	_fivetran_id
Shrink-ray gun	1	'2000-01-01 00:00:00'	0	FALSE	asdf
Cookie robot	2	'2000-01-01 00:00:02'	3	TRUE	dfdf
Cookie robot	4	'2000-01-01 00:00:02'	4	FALSE	zxfd
Boogie robot	3	'2000-01-01 00:00:02'	5	TRUE	ewra

So, while there may be just one record in your source database where description = Cookie robot, there are two in your destination - an old version where _fivetran_deleted = TRUE, and a new version where _fivetran_deleted = FALSE.

We also de-duplicate* rows before we load them into your destination. We use the _fivetran_id field, which is the hash of the non-Fivetran values in every row, to avoid creating multiple rows with identical contents. If, for example, you have the following table in your source:

*Fivetran Teleport Sync does not support de-duplication for tables without primary keys. Learn more in our Fivetran Teleport Sync documentation.

description	quantity
Shrink-ray gun	1
Shrink-ray gun	1
Shrink-ray gun	1

Then your destination table will look like this:

description	quantity	_fivetran_synced	_fivetran_index	_fivetran_deleted	_fivetran_id
Shrink-ray gun	1	'2000-01-01 00:00:00'	0	FALSE	asdf

Recommendations for Fivetran Teleport Sync

We recommend the following:

If you want to sync tables with primary keys, we recommend that the combined size of all the tables does not exceed 100 GB.
If you want to sync tables without primary keys, we recommend that the combined size of all the tables does not exceed 10 GB. Contact our support team for assistance with your syncs.

Deleted columns

If you delete a column in a source table, we replace all values in the deleted column with NULL values in the destination table.

Workspace for temporary tables

We create temporary tables in the source database to sync source tables without a primary key. By default, we create these tables in the same database and schema as the source table that we're syncing.

You can opt to create these temporary tables in a different location by using the Use working database for queries toggle in the connection setup form and then specifying the database and schema names.