OneLake
OneLake is Microsoft Fabric's unified and logical data lake.
Setup guide
Follow our step-by-step OneLake setup guide to connect your OneLake destination with Fivetran.
Type transformation and mapping
The data types in your OneLake destination follow Fivetran's standard data type storage.
We use the following data type conversions:
| Fivetran Data Type | Destination Data Type |
|---|---|
| BOOLEAN | BOOLEAN |
| SHORT | SHORT |
| INT | INTEGER |
| LONG | LONG |
| BIGDECIMAL | DECIMAL(38, 10) |
| FLOAT | FLOAT |
| DOUBLE | DOUBLE |
| LOCALDATE | DATE |
| INSTANT | TIMESTAMP |
| STRING | STRING |
| XML | STRING |
| JSON | STRING |
| BINARY | BINARY |
NOTE: Fivetran stores hex-encoded BINARY values in your destination. You can use the unhex function, `decode(unhex(col_name), 'UTF-8')`, in your queries to fetch the decoded BINARY values.
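As a minimal sketch of what the `decode(unhex(...))` round trip recovers, the following stdlib Python reproduces the same hex-to-bytes-to-string conversion locally. The sample value `"48656c6c6f"` is an assumption for illustration, not data from a real destination:

```python
# Simulate a hex-encoded BINARY value as Fivetran stores it, then decode it
# the same way decode(unhex(col_name), 'UTF-8') would in a query.
stored_hex = "48656c6c6f"  # assumed sample: hex encoding of the bytes b"Hello"

decoded = bytes.fromhex(stored_hex).decode("utf-8")  # unhex, then UTF-8 decode
print(decoded)  # Hello
```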
Supported query engines
You can use the following query engines to query your data from your OneLake destination:
- Azure Synapse Analytics (native application of Microsoft Fabric)
- Databricks
NOTE: Make sure Unity Catalog is not integrated with your Databricks workspace.
Data format
Fivetran stores your data in a structured format in the destination. We write your source data to Parquet files in the Fivetran pipeline and use Delta Lake format to store these files in the data lake.
Folder structure
We write your data to the following directory: `<lakehouse_name>.lakehouse/Tables/<table_name>`.
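To make the layout concrete, here is a small Python sketch that assembles a table path following the pattern above. The lakehouse and table names are hypothetical placeholders, not real resources:

```python
# Build the OneLake directory path for a table, following the documented
# <lakehouse_name>.lakehouse/Tables/<table_name> layout.
lakehouse_name = "my_lakehouse"  # hypothetical lakehouse name
table_name = "orders"            # hypothetical table name

table_path = f"{lakehouse_name}.lakehouse/Tables/{table_name}"
print(table_path)  # my_lakehouse.lakehouse/Tables/orders
```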
Limitations
- Fivetran does not support history mode for OneLake destinations.
- Fivetran creates DECIMAL columns with the maximum precision and scale (38, 10).
- Spark SQL and SparkR queries cannot read the maximum values of the DOUBLE and FLOAT data types.
- SparkR queries cannot read the minimum and maximum values of the LONG data type.
- Spark SQL and SparkR queries truncate timestamp values to seconds. To query a table using a TIMESTAMP column, you can use the `from_unixtime(unix_timestamp(<col_name>, 'yyyy-MM-dd HH:mm:ss.SSS'), 'yyyy-MM-dd HH:mm:ss.ms')` clause in your queries to get accurate data, including milliseconds or microseconds. You can also use PySpark or Spark Scala to get accurate values.
- Table and schema names must not start or end with an underscore, and must not contain multiple consecutive underscores (`__`).