OneLake

OneLake is Microsoft Fabric's unified and logical data lake.

Supported deployment models

We support the SaaS Deployment model for the destination.

Setup guide

Follow our step-by-step OneLake setup guide to connect your OneLake destination with Fivetran.

Type transformation and mapping

The data types in your OneLake destination follow Fivetran's standard data type storage.

We use the following data type conversions:

Fivetran Data Type	Destination Data Type	Notes
BOOLEAN	BOOLEAN
SHORT	SHORT
INT	INTEGER
LONG	LONG
BIGDECIMAL	DECIMAL(38, 10), DOUBLE, or STRING	If a primary key's precision exceeds `38` or scale exceeds `10`, we convert its data type to STRING. If a column has no precision or scale defined, we convert its data type to DOUBLE. For all other columns, we convert the data type to DECIMAL(38,10).
FLOAT	FLOAT
DOUBLE	DOUBLE
LOCALDATE	DATE
INSTANT	TIMESTAMP
STRING	STRING
XML	STRING
JSON	STRING
BINARY	BINARY

Supported query engines

You can use the following query engines to query your data from your OneLake destination:

Azure Synapse Analytics (native application of Microsoft Fabric)
Databricks
Make sure Unity Catalog is not integrated with your Databricks workspace.

Data format

Fivetran stores your data in a structured format in the destination. We write your source data to Parquet files in the Fivetran pipeline and use Delta Lake format to store these files in the data lake.

Folder structure

We write your data to the following directory: <lakehouse_name>.lakehouse/Tables/<table_name> or <lakehouse_guid>/Tables/<table_name>

Table maintenance operations

We perform the following maintenance operations on the Delta Lake tables in your destination:

Delete old snapshots and removed files: We delete the table snapshots that are older than the Snapshot Retention Period you specify in the destination setup form. However, we always retain the last 4 checkpoints of a table before deleting its snapshots. We also delete removed files. Removed files are the files that are not referenced in the latest table snapshots but were referenced in the older snapshots. These removed files contribute to your OneLake subscription costs. We identify such files that are older than the snapshot retention period and delete them. This cleanup process runs once daily.
Delete orphan files: Orphan files are created because of unsuccessful operations within your data pipeline. The orphan files are stored in your destination but are no longer referenced in the Delta Lake table metadata. These orphan contribute to your OneLake subscription costs. We identify such files that are older than 7 days and delete them every alternate Saturday.

You may observe a sync delay for your connections while the table maintenance operations are in progress.

Limitations

Fivetran creates the DECIMAL columns with maximum precision and scale (38, 10).
Spark SQL and SparkR queries cannot read the maximum values of DOUBLE and FLOAT data types.
SparkR queries cannot read the minimum and maximum values of LONG data type.
Spark SQL and SparkR query truncate the timestamp values to seconds. To query any table using a TIMESTAMP column, you can use unixtime(unix_timestamp(<col_name>, 'yyyy-MM-dd HH:mm:ss.SSS'),'yyyy-MM-dd HH:mm:ss.ms') clause in your queries to get the accurate data, including milliseconds or microseconds. You can also use PySpark or Spark Scala to get accurate values.
The table and schema names must not start or end with an underscore, and must not contain multiple consecutive underscores ( __ ).
Fivetran does not support the Change Data Feed feature for Delta Lake tables. You must not enable Change Data Feed for the Delta Lake tables that Fivetran creates in your OneLake destination.
Fivetran does not support the archive access tier because its retrieval time can extend to several hours.