AWS MSK
Updated November 16, 2023
AWS Managed Streaming for Kafka is a managed distributed streaming platform.
|Feature|Supported|Notes|
|---|---|---|
|Custom data|Yes|All tables and fields|
|Data blocking|Yes|Column level|
|Re-sync|Yes|Connector level. If there is a retention period set for records, we will not be able to fetch records beyond the retention period.|
|API configurable|Yes|API configuration|
|Fivetran data models|||
Follow our step-by-step AWS MSK setup guide to connect AWS MSK with your destination using Fivetran connectors.
Fivetran creates one table for each topic.
IMPORTANT: You can choose which topics to sync on the Schema tab in your Fivetran dashboard.
For each table, Fivetran creates partition, offset, timestamp, and key columns, where partition and offset are the primary keys. The timestamp column may contain either create_time or log_append_time, as per the server configuration.
You can select to sync packed or unpacked messages. For packed messages, Fivetran syncs the message in the value column. Unpacked messages must be in JSON format. For each first-level JSON element, Fivetran creates a separate column. The column names are formed using the format value_<element_name>.
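As a rough illustration of the packed and unpacked modes, the sketch below builds one destination row from one Kafka message. It is not Fivetran's implementation: the function name message_to_row is hypothetical, the column names partition, offset, timestamp, key, and value and the value_<element_name> prefix for JSON elements are assumptions based on the description here, and how nested elements are stored is not specified, so they are simply kept as parsed values.

```python
import json


def message_to_row(partition, offset, timestamp, key, value, unpacked=True):
    """Build one destination row from one Kafka message (illustrative only)."""
    # Metadata columns; (partition, offset) identifies a message within a topic.
    row = {
        "partition": partition,
        "offset": offset,
        "timestamp": timestamp,
        "key": key,
    }
    if not unpacked:
        # Packed mode: the whole message body goes into a single value column.
        row["value"] = value
        return row

    # Unpacked mode: the body must be a JSON object.
    body = json.loads(value)
    if not isinstance(body, dict):
        raise ValueError("unpacked messages must be JSON objects")

    # Each first-level element becomes its own column.
    # Assumption: nested elements are kept as parsed values, not flattened.
    for name, element in body.items():
        row[f"value_{name}"] = element
    return row
```

For example, message_to_row(0, 42, 1700000000000, "user-1", '{"id": 7}') would produce a row with a value_id column, while passing unpacked=False would keep the raw JSON text in the value column.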
For the Avro message type, we sync the data as unpacked messages by default. For the values, we sync each element with the column name format value_<element_name>. If the key is also serialised using an Avro schema, we sync the elements in the key with the column name format key_<element_name>. If the key is not serialised using an Avro schema, we sync it as a string.
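The Avro naming rules above can be sketched as follows. This is a minimal illustration, assuming the key and value have already been decoded (an Avro record arrives as a dict, a non-Avro key as bytes or a string). The function name avro_message_to_columns is hypothetical, and storing a non-Avro key in a column named key is an assumption.

```python
def avro_message_to_columns(key, value):
    """Map a decoded Avro message to column names (illustrative only).

    key:   dict if the key was serialised with an Avro schema,
           otherwise bytes, str, or None.
    value: dict holding the decoded Avro record.
    """
    columns = {}

    if isinstance(key, dict):
        # Avro key: one column per element, prefixed with key_.
        for name, element in key.items():
            columns[f"key_{name}"] = element
    elif isinstance(key, bytes):
        # Non-Avro key: synced as a string (column name is an assumption).
        columns["key"] = key.decode("utf-8", errors="replace")
    else:
        columns["key"] = None if key is None else str(key)

    # Avro value: one column per element, prefixed with value_.
    for name, element in value.items():
        columns[f"value_{name}"] = element

    return columns
```

With an Avro key {"user_id": 5} and value {"amount": 9.5}, this yields the columns key_user_id and value_amount. With a plain key such as b"order-1", the key is kept as the string "order-1".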
After you set up the connection, Fivetran starts syncing all available messages from the Kafka topics. For each partition of a topic, it goes to the earliest available offset, consumes the messages from there, and loads them into the warehouse. Once the retention period expires, messages are deleted from the Kafka topics, and deleted messages cannot be synced. If you re-sync the connector, it fetches only the messages that are currently available.
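The interaction between retention and re-sync can be illustrated with a toy in-memory model. It does not talk to a real Kafka cluster and is not how Fivetran is implemented. The Partition class and full_sync function are hypothetical names, and retention is simplified to a per-record age check, whereas real brokers delete whole log segments.

```python
from dataclasses import dataclass, field


@dataclass
class Partition:
    """Toy stand-in for one partition of a Kafka topic."""
    log: dict = field(default_factory=dict)  # offset -> (timestamp_ms, value)
    next_offset: int = 0

    def append(self, timestamp_ms, value):
        self.log[self.next_offset] = (timestamp_ms, value)
        self.next_offset += 1

    def apply_retention(self, now_ms, retention_ms):
        # Simplification: drop every record older than the retention period.
        self.log = {
            offset: record
            for offset, record in self.log.items()
            if now_ms - record[0] <= retention_ms
        }

    def earliest_offset(self):
        # With an empty log, the earliest offset is where the next record lands.
        return min(self.log) if self.log else self.next_offset


def full_sync(partitions):
    """Read each partition from its earliest available offset."""
    rows = []
    for partition_id, partition in enumerate(partitions):
        start = partition.earliest_offset()
        for offset in range(start, partition.next_offset):
            record = partition.log.get(offset)
            if record is None:
                continue  # already deleted by retention
            timestamp_ms, value = record
            rows.append((partition_id, offset, timestamp_ms, value))
    return rows
```

In this model, a first full_sync returns every record. After apply_retention removes the older records, a second full_sync (the equivalent of a re-sync) starts at the new earliest offset and returns only what is still in the log.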