Databricks

The Databricks Lakehouse Platform combines the key features of data lakes and data warehouses. It supports multiple data workloads including analytics, business intelligence, data engineering, data science, and machine learning. The platform is built on open source and open standards.

Supported implementations

Fivetran supports connecting with three different Databricks implementations: Databricks on AWS, Databricks on Azure, and Databricks on Google Cloud Platform.

Supported deployment models

We support the SaaS and Hybrid deployment models for all Databricks implementations.

NOTE: You must have an Enterprise or Business Critical plan to use the Hybrid Deployment model.

Setup guide

Follow our step-by-step setup guide to connect Databricks on AWS, Databricks on Azure, or Databricks on Google Cloud Platform to Fivetran.

Type transformation mapping

The data types in Databricks follow Fivetran's standard data type storage.

We use the following data type conversions:

Fivetran Data Type	Destination Data Type	Notes
BOOLEAN	BOOLEAN
SHORT	SMALLINT
INT	INT
LONG	BIGINT
BIGDECIMAL	DECIMAL or DOUBLE	If a column's precision exceeds `38`, we convert its data type to DOUBLE. Otherwise, we convert it to DECIMAL.
FLOAT	FLOAT
DOUBLE	DOUBLE
LOCALDATE	DATE
LOCALDATETIME	TIMESTAMP	Databricks requires a time zone value
INSTANT	TIMESTAMP
STRING	STRING
JSON	STRING	Databricks doesn't support JSON
BINARY	BINARY
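
The conversions above can be sketched as a small lookup. This is an illustrative sketch only, not Fivetran's implementation: the function name `destination_type` and the constant names are hypothetical, and treating a missing precision as DECIMAL is an assumption made for the example.

```python
from typing import Optional

# Direct conversions copied from the table above.
# BIGDECIMAL is handled separately because it depends on precision.
FIVETRAN_TO_DATABRICKS = {
    "BOOLEAN": "BOOLEAN",
    "SHORT": "SMALLINT",
    "INT": "INT",
    "LONG": "BIGINT",
    "FLOAT": "FLOAT",
    "DOUBLE": "DOUBLE",
    "LOCALDATE": "DATE",
    "LOCALDATETIME": "TIMESTAMP",
    "INSTANT": "TIMESTAMP",
    "STRING": "STRING",
    "JSON": "STRING",  # Databricks doesn't support JSON
    "BINARY": "BINARY",
}

# Precision threshold stated in the BIGDECIMAL row.
MAX_DECIMAL_PRECISION = 38


def destination_type(fivetran_type: str, precision: Optional[int] = None) -> str:
    """Return the Databricks type for a Fivetran type (hypothetical helper)."""
    name = fivetran_type.upper()
    if name == "BIGDECIMAL":
        # Precision above 38 falls back to DOUBLE; otherwise DECIMAL.
        # Assumption: an unknown precision is treated as DECIMAL.
        if precision is not None and precision > MAX_DECIMAL_PRECISION:
            return "DOUBLE"
        return "DECIMAL"
    return FIVETRAN_TO_DATABRICKS[name]
```

For example, `destination_type("BIGDECIMAL", precision=40)` returns `"DOUBLE"`, while a precision of exactly 38 still maps to `"DECIMAL"`.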

Column names

Fivetran ignores the case of column names in your destination tables because Databricks is case-insensitive.
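
Because column names are compared without regard to case, source columns whose names differ only by case refer to the same destination column name. The sketch below shows one way to spot such names before syncing. The helper `case_insensitive_collisions` is hypothetical and makes no claim about how Fivetran itself handles these columns.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def case_insensitive_collisions(column_names: Iterable[str]) -> Dict[str, List[str]]:
    """Group column names that are equal when case is ignored (hypothetical helper)."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for name in column_names:
        # Lower-casing approximates a case-insensitive comparison.
        groups[name.lower()].append(name)
    # Keep only the groups where more than one spelling maps to the same name.
    return {key: names for key, names in groups.items() if len(names) > 1}
```

Calling it with `["Id", "ID", "name"]` reports that `Id` and `ID` share the same case-insensitive name `id`.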

Table maintenance

We perform weekly maintenance operations on the Delta tables. These operations run during the weekend.

NOTE: You may observe a sync delay for your connections while the destination table maintenance operations are in progress.

Liquid clustering

Fivetran does not support liquid clustering for Delta tables.

Transformations and Quickstart data models

Databricks on Azure destinations with OAuth authentication do not support Transformations for dbt Core* and Quickstart data models.

Unstructured files (Private Preview)

By default, we store unstructured files in a managed volume within the connection's destination schema. Alternatively, you can configure your destination to use an external volume to store the unstructured files. In either case, we use the destination table name you specified when setting up the source connection as the volume name.
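
As a rough illustration of the naming rule above, the following sketch builds the root path of such a volume using the Unity Catalog convention `/Volumes/<catalog>/<schema>/<volume>`. The function name `volume_root` and the example catalog, schema, and table names are hypothetical; how files are laid out inside the volume is not covered here.

```python
def volume_root(catalog: str, schema: str, destination_table: str) -> str:
    """Return the Unity Catalog path of the volume (hypothetical helper).

    The volume name is the destination table name chosen when the
    source connection was set up, as described above.
    """
    return f"/Volumes/{catalog}/{schema}/{destination_table}"


# Hypothetical names, used only for illustration.
print(volume_root("main", "sales_files", "invoices"))
```

With these example names, the volume root is `/Volumes/main/sales_files/invoices`.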

* dbt Core is a trademark of dbt Labs, Inc. All rights therein are reserved to dbt Labs, Inc. Fivetran Transformations is not a product or service of or endorsed by dbt Labs, Inc.