Snowflake

Snowflake's architecture separates storage from computing, allowing customers to run Fivetran in a separate logical data warehouse. You can have multiple logical data warehouses running queries on the same underlying data. You can then provision one logical data warehouse as a destination for Fivetran to load new data into and another for your analysts to query the data.

Fivetran supports Snowflake as both a source and a destination.

You may have several data sources, each of which has its own complex schema. We sync the entire schema of every data source to Snowflake, automatically and continuously. You can connect all your data sources, which we'll sync several times a day, without worrying about bogging down analysts' queries.

A Snowflake account can have multiple compute warehouses and multiple databases. If you want to sync data into two databases in a single Snowflake account, and/or use multiple compute warehouses to do so, you will have to create two Fivetran destinations: one for each warehouse or database connection.

Fivetran uses the Snowflake JDBC driver version 3.23.2 to connect to your source and destination.

Supported deployment models

We support the SaaS and Hybrid deployment models for the destination.

You must have an Enterprise or Business Critical plan to use the Hybrid Deployment model.

Setup guide

Follow our step-by-step Snowflake setup guide to connect your Snowflake data warehouse with Fivetran.

Type transformation and mapping

The data types in your Snowflake data warehouse follow Fivetran's standard data type storage.

We use the following data type conversions:

Fivetran Data Type	Destination Data Type	Notes
BOOLEAN	BOOLEAN
SHORT	NUMBER(38,0)
INT	NUMBER(38,0)
LONG	NUMBER(38,0)
BIGDECIMAL	DECIMAL or DOUBLE PRECISION	We convert BIGDECIMAL to DOUBLE PRECISION only if: the column's precision exceeds `38` the column's scale exceeds `37` the column has no precision or scale defined
FLOAT	FLOAT
DOUBLE	FLOAT
LOCALDATE	DATE
LOCALDATETIME	TIMESTAMP_NTZ
INSTANT	TIMESTAMP_TZ
STRING	VARCHAR
XML	VARIANT
JSON	VARIANT
JSONB	VARIANT
BINARY	BINARY

The maximum allowed length for VARIANT and VARCHAR/TEXT columns is 16 MB. We truncate values in these columns if they exceed the maximum length.

Parameter Lengths for Textual Types

Data type parameters (the x in varchar(x)) can differ between sources and destinations for textual types. This is because a source may consider this parameter to represent the number of characters in a piece of text data, and the destination may consider it to mean the number of bytes in that piece of data. For example, non-Latin characters are often represented using UTF-8 encoding with one to four one-byte code units. The character $ requires one byte, the character £ requires two bytes, and the character € requires three bytes to encode in UTF-8. Thus, if your source considers each one a single character, you could parameterize the value $£€ with a length of 3, but if your destination considers each by its constituent bytes, it would need to be parameterized with a value of 6 (1 for $, 2 for £, 3 for €). Therefore, if parameter lengths differ between your source and destination for text types, it is not necessarily indicative of a problem. Furthermore, in Snowflake, there is no difference in either performance or storage between a varchar(16) and varchar(1024) column if every value is under 16 characters long.

Flatten and query JSON data

To flatten and query JSON data, do the following:

Convert strings containing JSON to a VARIANT type using Snowflake's Parse_JSON function.
Use Snowflake's FLATTEN function on the VARIANT data.
Use Snowflake's SQL Extensions to query the semi-structured data.
You can also use Fivetran's native Transformations to run the functions automatically against your JSON data and save the results in a new schema.

Table identifiers and aliases

Snowflake does not allow ANSI reserved keywords to be used as table names and aliases in a FROM clause.

When you query your Snowflake data warehouse, be sure to encase the table names and aliases in double quotes to avoid exception errors. For example, use "JOIN" instead of JOIN.

Column identifiers

Snowflake does not allow the following ANSI reserved keywords to be column names when they are capitalized:

CONSTRAINT
CURRENT_DATE
CURRENT_TIME
CURRENT_TIMESTAMP
CURRENT_USER
LOCALTIME
LOCALTIMESTAMP

To ensure a valid column name, Fivetran converts the keyword to lowercase and encases it in double quotes (for example, LOCALTIME becomes "localtime"). When you query these columns, be sure to use the converted column name.

Migrate databases

To migrate your Snowflake database that is configured with Fivetran to another database, perform the following steps:

Create variables for the role and database. Run the following commands:

set role_name = 'FIVETRAN_ROLE';
set database_name = 'FIVETRAN_DATABASE';

Grant the Fivetran role appropriate permissions on the new database. Run the following command:
```
GRANT ALL PRIVILEGES
ON DATABASE {$database_name}
TO ROLE {$role_name};
```
See our setup guide for more information.
Pause all your connections from the Fivetran dashboard. On the connection dashboard, set the connection status toggle to PAUSED.
Copy or move the schemas you want to sync from the existing Snowflake database to the new database.

Grant the FIVETRAN_ROLE the correct permissions on the database schemas and tables. Run the following commands:

set schema_name = <customer_schema_name>;
GRANT OWNERSHIP on SCHEMA {$schema_name} to FIVETRAN_ROLE REVOKE CURRENT GRANTS;`
GRANT OWNERSHIP on ALL TABLES IN SCHEMA {$schema_name} to FIVETRAN_ROLE REVOKE CURRENT GRANTS;
GRANT ALL on ALL TABLES IN {$schema_name} to FIVETRAN_ROLE;

Update the destination connection settings to specify the new database details. In the destination dashboard, click Actions and then click Edit connection details. Enter the new database name and provide the authentication details.

Create variables for the user and warehouse. Run the following commands:

set user_name = 'FIVETRAN_USER';
set warehouse_name = 'FIVETRAN_WAREHOUSE';

Change the default warehouse. Run the following command:

ALTER USER {$username}; 
SET default_warehouse = {$warehouse_name};

Enable your connections. On the connection dashboard, set the connection status toggle to ENABLED.

Data load costs

Fivetran uses SQL queries to merge data into the tables in your data warehouse. Because Snowflake bills for compute usage on a per-second basis, they charge you when Fivetran loads data into your data warehouse. To reduce costs, we recommend that you provision an X-Small data warehouse with one-minute auto-suspend for Fivetran to load new data into. Most Fivetran customers will be billed for 10 - 20% usage of an X-Small warehouse, though more frequent syncs will increase the usage.

Reduce Fail-safe cost

Registering Fivetran with a transient database will ensure all Fivetran-created tables are transient by default. This will help reduce the additional costs and storage space associated with Fail-safe. However, transient tables cannot be recovered in case of operational or system failures.

See Snowflake's Fail-safe documentation for more information.

Improve query performance

Add cluster keys

Fivetran doesn’t enforce cluster keys in the Snowflake tables because the data is well clustered in most situations. You can define a clustering key for a table if the table size has increased or you observe slow query performance.

To improve the performance of the Fivetran-triggered queries, such as MERGE, UPDATE, DELETE, add cluster keys on the columns used to join the existing and the incoming data. See Snowflake's Clustering documentation for more information on how to add clustering keys and understand the cost implications.

Adding cluster keys helps reduce the amount of data scanned for each query. For optimal performance, add cluster keys for the columns that are part of the table's primary keys. You can find the columns from the table’s schema in Snowflake or in log events synced to your destination by either our free Fivetran Platform Connector, which we automatically add to every destination you create, or external log services.

Collation

Fivetran does not recommend using custom collations as it can impact data integrity. You can change the collation using {<col_name>} COLLATE {<collation_of_your_choice>} in your downstream queries while sorting or comparing your data. For example:

    SELECT {<col_name>} FROM {<fivetran_replicated_table>}
    ORDER BY {<col_name>} COLLATE {<collation_of_your_choice>};

If you choose to use a collation, make sure the collation is case-sensitive.

Unstructured files Beta

By default, we store unstructured files in a named internal stage within the connection's destination schema. Alternatively, you can configure your destination to use an Amazon S3 external stage to store the unstructured files. In either case, we use the destination table name you specified when setting up the source connection as the stage name. We also maintain a Snowflake directory table and update it after every sync. For more information, see our Unstructured File Replication feature documentation.