Apache Kafka
Fivetran supports Apache Kafka as a destination.
Apache Kafka is an open-source distributed streaming platform for building real-time data pipelines. The Apache Kafka platform is based on a persistent, append-only, and publish-subscribe log system that captures and moves real-time data and events.
Supported implementations
Fivetran supports connecting with the following Kafka implementations:
Type transformation mapping
The data types in your Apache Kafka destination follow Fivetran's standard data type storage.
We use the following data type conversions:
Fivetran Data Type | Destination Data Type |
---|---|
BOOLEAN | BOOLEAN |
INT | INT |
LONG | LONG |
BIGDECIMAL | DECIMAL |
FLOAT | FLOAT |
DOUBLE | DOUBLE |
LOCALDATE | DATE |
LOCALDATETIME | TIMESTAMP-MILLIS |
INSTANT | TIMESTAMP-MILLIS |
STRING | STRING |
XML | STRING |
JSON | STRING |
BINARY | STRING |
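If you validate downstream schemas against this mapping, it can help to keep the table in code. The following is a minimal sketch that only restates the table above as a lookup; the names `FIVETRAN_TO_DESTINATION_TYPE` and `destination_type` are hypothetical and not part of any Fivetran API.

```python
# Lookup table copied from the type transformation mapping above.
# The dictionary and function names are illustrative, not a Fivetran API.
FIVETRAN_TO_DESTINATION_TYPE = {
    "BOOLEAN": "BOOLEAN",
    "INT": "INT",
    "LONG": "LONG",
    "BIGDECIMAL": "DECIMAL",
    "FLOAT": "FLOAT",
    "DOUBLE": "DOUBLE",
    "LOCALDATE": "DATE",
    "LOCALDATETIME": "TIMESTAMP-MILLIS",
    "INSTANT": "TIMESTAMP-MILLIS",
    "STRING": "STRING",
    "XML": "STRING",
    "JSON": "STRING",
    "BINARY": "STRING",
}


def destination_type(fivetran_type):
    """Return the destination type for a Fivetran type name (case-insensitive)."""
    return FIVETRAN_TO_DESTINATION_TYPE[fivetran_type.upper()]
```

For example, `destination_type("INSTANT")` returns `"TIMESTAMP-MILLIS"`, and an unknown type name raises `KeyError`.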
Setup guide
Follow our step-by-step setup guides for specific instructions on how to set up Apache Kafka as a destination:
Data load costs
Apache Kafka does not charge you extra when Fivetran loads data into your destination.
Destination data storage
Events in Kafka are immutable, meaning they cannot be modified or deleted. Consequently, we append all operations (upsert/update/delete) as new records in Kafka.
Example:
Consider the following initial records in your Kafka destination:
ID | Column2 | Column3 | _fivetran_op_type |
---|---|---|---|
1 | a | b | 0 |
2 | x | y | 0 |
3 | p | q | 0 |
Assume the following changes occur in your source:
- A new row with ID = 4 is inserted.
- Column2 of the record with ID = 2 is updated to 'z'.
- The row with ID = 3 is deleted.
After we perform a sync, the records in the Kafka destination appear as follows:
ID | Column2 | Column3 | _fivetran_op_type | _fivetran_updated_columns |
---|---|---|---|---|
1 | a | b | 0 | null |
2 | x | y | 0 | null |
3 | p | q | 0 | null |
4 | k | l | 0 | null |
2 | z | null | 1 | column2 |
3 | p | q | 2 | null |
Here, _fivetran_op_type and _fivetran_updated_columns are Fivetran system columns that indicate the following:
- _fivetran_op_type indicates the operation type: 0 = upsert, 1 = update, 2 = delete.
- _fivetran_updated_columns lists the columns changed by an update operation. If multiple columns are updated, their names are joined into a single string separated by semicolons (';').
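Because every operation arrives as a new record, a consumer that wants the current state of a table has to replay the records in order. The sketch below shows one possible way to do that using the example records above as plain dictionaries. It rests on several assumptions that the text does not confirm: `ID` is treated as the primary key, an update overwrites only the columns named in _fivetran_updated_columns, and column names are matched case-insensitively (the example lists `column2` for the column shown as `Column2`). The function name `apply_records` is hypothetical.

```python
# Records from the example above, in the order they appear in the topic.
RECORDS = [
    {"ID": 1, "Column2": "a", "Column3": "b", "_fivetran_op_type": 0, "_fivetran_updated_columns": None},
    {"ID": 2, "Column2": "x", "Column3": "y", "_fivetran_op_type": 0, "_fivetran_updated_columns": None},
    {"ID": 3, "Column2": "p", "Column3": "q", "_fivetran_op_type": 0, "_fivetran_updated_columns": None},
    {"ID": 4, "Column2": "k", "Column3": "l", "_fivetran_op_type": 0, "_fivetran_updated_columns": None},
    {"ID": 2, "Column2": "z", "Column3": None, "_fivetran_op_type": 1, "_fivetran_updated_columns": "column2"},
    {"ID": 3, "Column2": "p", "Column3": "q", "_fivetran_op_type": 2, "_fivetran_updated_columns": None},
]


def apply_records(records, key="ID"):
    """Replay append-only records into a dict of current rows (illustrative only)."""
    state = {}
    for rec in records:
        op = rec["_fivetran_op_type"]
        row_key = rec[key]
        # Drop the Fivetran system columns; keep only user data.
        data = {k: v for k, v in rec.items() if not k.startswith("_fivetran_")}
        if op == 0:
            # Upsert: the record carries the full row.
            state[row_key] = data
        elif op == 1:
            # Update: assume only the listed columns carry meaningful values.
            updated = rec.get("_fivetran_updated_columns") or ""
            by_lower = {k.lower(): k for k in data}
            row = state.setdefault(row_key, {key: row_key})
            for col in updated.split(";"):
                name = by_lower.get(col.strip().lower())
                if name is not None:
                    row[name] = data[name]
        elif op == 2:
            # Delete: remove the row from the materialized state.
            state.pop(row_key, None)
    return state


state = apply_records(RECORDS)
```

With the example data, `state` holds rows 1, 2, and 4; row 2 has Column2 set to 'z' and keeps its original Column3 value 'y', and row 3 is gone.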
Column data type changes
To change a column's data type, Fivetran updates the schema in the schema registry with the new data type. Sometimes the compatibility type of your topic does not allow us to update the schema. In such cases, set the compatibility type of your topic to NONE and modify your downstream queries to consume data from the updated schema.
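How you set the compatibility type depends on your schema registry. As a hedged sketch, assuming a registry that exposes the Confluent Schema Registry REST API, the subject-level setting can be changed with a `PUT` request to `/config/{subject}`. The registry URL `http://localhost:8081` and the subject `orders-value` are hypothetical placeholders, and the `<topic>-value` subject naming is an assumption about your configuration. Authentication is omitted.

```python
import json
import urllib.parse
import urllib.request


def build_compatibility_request(registry_url, subject, level="NONE"):
    """Build a PUT /config/{subject} request for a Confluent-style registry."""
    url = f"{registry_url.rstrip('/')}/config/{urllib.parse.quote(subject, safe='')}"
    body = json.dumps({"compatibility": level}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/vnd.schemaregistry.v1+json"},
        method="PUT",
    )


def set_compatibility(registry_url, subject, level="NONE"):
    """Send the request and return the registry's JSON response."""
    req = build_compatibility_request(registry_url, subject, level)
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Hypothetical registry URL and subject; building the request sends nothing.
request = build_compatibility_request("http://localhost:8081", "orders-value")
```

Calling `set_compatibility("http://localhost:8081", "orders-value")` would send the request. Building it separately lets you inspect the URL and body first.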