OneLake
OneLake is Microsoft Fabric's unified and logical data lake.
Supported deployment models
We support the SaaS Deployment model for the destination.
Setup guide
Follow our step-by-step OneLake setup guide to connect your OneLake destination with Fivetran.
Type transformation and mapping
The data types in your OneLake destination follow Fivetran's standard data type storage.
We use the following data type conversions:
Fivetran Data Type | Destination Data Type |
---|---|
BOOLEAN | BOOLEAN |
SHORT | SHORT |
INT | INTEGER |
LONG | LONG |
BIGDECIMAL | DECIMAL(38, 10) |
FLOAT | FLOAT |
DOUBLE | DOUBLE |
LOCALDATE | DATE |
INSTANT | TIMESTAMP |
STRING | STRING |
XML | STRING |
JSON | STRING |
BINARY | BINARY |
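As a quick sanity check, you can confirm these mappings by inspecting a synced table's Spark schema. The sketch below assumes a Delta-enabled Spark environment (such as Microsoft Fabric or Databricks); the table path is a placeholder, and the path layout is described under Folder structure below:

```python
from pyspark.sql import SparkSession

# Assumes a Delta-enabled Spark environment (e.g., Microsoft Fabric or
# Databricks). The table path is a placeholder; see Folder structure below.
spark = SparkSession.builder.getOrCreate()

df = spark.read.format("delta").load("<path_to_table>")
df.printSchema()  # e.g., a BIGDECIMAL source column prints as decimal(38,10)
```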
Supported query engines
You can use the following query engines to query your data from your OneLake destination:
- Azure Synapse Analytics (a native application of Microsoft Fabric)
- Databricks
NOTE: Make sure Unity Catalog is not integrated with your Databricks workspace.
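For example, from a Databricks cluster (with Unity Catalog not integrated, per the note above), you can read a synced table over OneLake's ADLS Gen2-compatible endpoint. This is a minimal sketch: the workspace, lakehouse, and table names are placeholders, and the cluster's Azure authentication configuration is assumed to already be in place.

```python
from pyspark.sql import SparkSession

# Assumes a Databricks cluster already configured to authenticate
# against Azure; authentication setup is omitted here.
spark = SparkSession.builder.getOrCreate()

# OneLake exposes an ADLS Gen2-compatible endpoint. The workspace,
# lakehouse, and table names below are placeholders.
path = (
    "abfss://<workspace_name>@onelake.dfs.fabric.microsoft.com/"
    "<lakehouse_name>.lakehouse/Tables/<table_name>"
)

df = spark.read.format("delta").load(path)
df.show(10)  # preview the first 10 rows
```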
Data format
Fivetran stores your data in a structured format in the destination. We write your source data to Parquet files in the Fivetran pipeline and use Delta Lake format to store these files in the data lake.
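Both layers of this layout are visible from any Delta-capable engine: the table metadata reports the delta format, while the underlying data files are Parquet. A minimal sketch, assuming a Delta-enabled Spark session and a placeholder table path:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# DESCRIBE DETAIL is a Delta Lake SQL command that reports the table
# format ("delta"), its storage location, and the number of underlying
# Parquet data files. The path is a placeholder.
detail = spark.sql("DESCRIBE DETAIL delta.`<path_to_table>`")
detail.select("format", "location", "numFiles").show(truncate=False)
```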
Folder structure
We write your data to the following directory: `<lakehouse_name>.lakehouse/Tables/<table_name>` or `<lakehouse_guid>/Tables/<table_name>`
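For instance, in a Microsoft Fabric notebook attached to the lakehouse, Spark's default file system typically points at the lakehouse, so a table under Tables can be read with a relative path. A sketch with a placeholder table name:

```python
# In a Fabric notebook attached to the lakehouse, the built-in `spark`
# session resolves relative paths against the lakehouse, so the Tables
# directory is addressable directly. The table name is a placeholder.
df = spark.read.format("delta").load("Tables/<table_name>")
print(df.count())  # row count of the synced table
```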
Table maintenance operations
We perform the following maintenance operations on the Delta Lake tables in your destination:
- Delete old snapshots and removed files: We delete table snapshots that are older than the Snapshot Retention Period you specify in the destination setup form. However, we always retain the last 4 checkpoints of a table before deleting its snapshots. We also delete removed files, which are files that are no longer referenced in the latest table snapshots but were referenced in older snapshots and still contribute to your OneLake subscription costs. We identify such files that are older than the snapshot retention period and delete them. This cleanup process runs once daily.
- Delete orphan files: Orphan files are created by unsuccessful operations within your data pipeline. They are stored in your destination but are no longer referenced in the Delta Lake table metadata, so they contribute to your OneLake subscription costs. We identify such files that are older than 7 days and delete them every alternate Saturday. Both operations are illustrated in the sketch after the note below.
NOTE: You may observe a sync delay for your connectors while the table maintenance operations are in progress.
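These operations run automatically; you do not need to trigger them. For illustration only, they are roughly analogous to Delta Lake's built-in history inspection and VACUUM commands (placeholder path; avoid running VACUUM yourself on Fivetran-managed tables):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Illustration only: Fivetran performs this maintenance for you.
# DESCRIBE HISTORY lists a Delta table's snapshots (versions).
spark.sql("DESCRIBE HISTORY delta.`<path_to_table>`").show()

# VACUUM deletes data files that are no longer referenced by snapshots
# within the retention window (here, 7 days expressed as 168 hours).
spark.sql("VACUUM delta.`<path_to_table>` RETAIN 168 HOURS")
```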
Limitations
- Fivetran creates DECIMAL columns with the maximum precision and scale (38, 10).
- Spark SQL and SparkR queries cannot read the maximum values of the DOUBLE and FLOAT data types.
- SparkR queries cannot read the minimum and maximum values of the LONG data type.
- Spark SQL and SparkR queries truncate timestamp values to seconds. To query a table using a TIMESTAMP column, use the `from_unixtime(unix_timestamp(<col_name>, 'yyyy-MM-dd HH:mm:ss.SSS'),'yyyy-MM-dd HH:mm:ss.ms')` clause in your queries to get accurate data, including milliseconds or microseconds. You can also use PySpark or Spark Scala to get accurate values; see the sketch after this list.
- Table and schema names must not start or end with an underscore and must not contain multiple consecutive underscores (`__`).
- Fivetran does not support the Change Data Feed feature for Delta Lake tables. You must not enable Change Data Feed for the Delta Lake tables that Fivetran creates in your OneLake destination.
- Fivetran does not support the archive access tier because its retrieval time can extend to several hours.
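A sketch of the timestamp workaround above, with placeholder table, column, and path names: the first query applies the from_unixtime clause in Spark SQL, and the second reads the full-precision values directly with PySpark.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Spark SQL: reformat the truncated timestamp using the clause from the
# limitation above. Table and column names are placeholders.
spark.sql("""
    SELECT from_unixtime(
             unix_timestamp(<col_name>, 'yyyy-MM-dd HH:mm:ss.SSS'),
             'yyyy-MM-dd HH:mm:ss.ms') AS ts
    FROM <table_name>
""").show(truncate=False)

# PySpark: reading the Delta table directly returns accurate values,
# including sub-second precision.
df = spark.read.format("delta").load("<path_to_table>")
df.select("<col_name>").show(truncate=False)
```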