
Events

Updated November 1, 2023

Event tracking measures consumer behavior on a website, app, or email, usually through pixel tracking. Fivetran integrates with several services that collect events sent from your website, mobile app, or server; Fivetran then loads these events into your destination.


Supported services

Fivetran offers support for the following event tracking libraries:

  • Amazon Kinesis Firehose
  • Apache Kafka
  • AWS MSK
  • Azure Event Hubs
  • Confluent Cloud
  • Heroku Kafka
  • Snowplow Analytics (open source)
  • Segment
  • Webhooks

See our Swagger documentation for more information about webhook endpoints.


Sync overview

After you instrument the tracking code on your website, server, or mobile application, Fivetran's event pipeline collects, enriches, and loads all of this data into your destination in near real time. In addition to being easy to set up, Fivetran is built to scale to hundreds of millions of events per day and automatically retains a secure backup of your event data. If your destination is ever compromised, we can easily reload all your events for you.

The following diagram outlines the process for event collection at Fivetran:

[Diagram: Events overview]

When our collection service receives events, we first buffer them in a queue and store them in our cloud storage buckets before writing them to a temporary file. When the sync runs, we push the events (the temporary file) to the destination. For more information on the data retention period of different connectors, see our Data retention period documentation.

NOTE: We do not sync the events that are in our queue or are getting stored in the storage buckets while a sync is in progress. We process these events during the next sync.
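The buffering behavior described above can be sketched as a toy model (illustrative only, not Fivetran code): events queue up as they arrive, and a sync drains only the events buffered before the sync started.

```python
from collections import deque


class EventBuffer:
    """Toy model of the collection flow: events are buffered in a queue,
    and each sync drains only what was already buffered when it began."""

    def __init__(self):
        self.queue = deque()

    def receive(self, event):
        # Collection service buffers each incoming event.
        self.queue.append(event)

    def sync(self):
        # Snapshot the queue length first: events arriving mid-sync
        # stay queued and are processed during the next sync.
        return [self.queue.popleft() for _ in range(len(self.queue))]


buf = EventBuffer()
buf.receive({"event": "click"})
buf.receive({"event": "view"})
print(buf.sync())  # both buffered events are pushed; the queue is now empty
```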

Supported regions

By default, we store event data in a cloud storage service in one of the following locations:

  • EU region (for destinations run in the EU region)
  • UK region (for destinations run in the UK region)
  • US region (for all other destinations)

NOTE: For the Webhooks connector, you can configure Fivetran to store event data in a bucket you manage.

Event collector servers

Our collection service uses regional collector servers to receive events and store them temporarily. We expose the collector servers using the webhooks.fivetran.com sub-domain and use Amazon Route 53 as our Domain Name System (DNS) server.

We support geolocation routing for the requests to the webhooks.fivetran.com sub-domain. We route the requests to the appropriate regional collectors using the location the DNS queries originate from. We store and process your data within the data processing location you select in the destination setup form.

For example:

  • Data processing location is US. When we receive a request to webhooks.fivetran.com, our collection service routes the requests to the US collector server. We store your data in a storage bucket in the US location.

  • Data processing location is EU. When we receive a request to webhooks.fivetran.com from a client in the US, our US collector server collects the request and then stores your data in a storage bucket in the EU location.

  • When we receive a request to webhooks.fivetran.com originating from outside the supported regions, we route the requests to the US collector server.
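The routing rules in the examples above can be modeled as a small sketch (illustrative only, not Fivetran code; the region identifiers are assumptions): collection follows the client's location, with a US fallback for unsupported regions, while storage always follows the destination's data processing location.

```python
# Regions with dedicated collector servers (assumed identifiers).
SUPPORTED_COLLECTOR_REGIONS = {"us", "eu", "uk"}


def collect_and_store(client_region, data_processing_location):
    """Return (collector region, storage region) per the routing rules:
    requests route by client geolocation, falling back to the US
    collector; data is stored in the selected processing location."""
    if client_region in SUPPORTED_COLLECTOR_REGIONS:
        collector = client_region
    else:
        collector = "us"  # unsupported regions fall back to the US collector
    return collector, data_processing_location


print(collect_and_store("us", "eu"))        # collected in US, stored in EU
print(collect_and_store("ap-south", "us"))  # fallback: collected in US
```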

We support handshake requests if your source requires a response with validation data from our collector servers before it establishes the connection or creates webhooks. Refer to the individual connector documentation to see whether handshakes are supported for the connector you are using.

Data retention period

Fivetran retains event data from the Webhooks connector and other connectors that use webhooks. We store that data so that it can be re-synced if needed. The data retention period depends on the connector type.

Connector    | Data Retention Period | Note
AppsFlyer    | 30 days               |
Eloqua       | 30 days               |
GitHub       | 30 days               |
Greenhouse   | 30 days               |
Help Scout   | 30 days               |
HubSpot      | 30 days               |
Intercom     | 30 days               |
Iterable     | 30 days               |
Jira         | 30 days               |
Pipedrive    | 30 days               |
Recharge     | 30 days               |
Branch       | Persistent            |
Mandrill     | Persistent            |
Segment      | Persistent            | If you choose to use webhooks instead of an S3 bucket
SendGrid     | Persistent            |
Shopify      | Persistent            |
Snowplow     | Persistent            |
SurveyMonkey | Persistent            |
Webhooks     | Persistent            |

Updating data

Fivetran takes the new data and appends it to the existing tables. When we encounter events that cannot be parsed (such as incomplete data), we skip those events and alert the user.
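The append-and-skip behavior can be sketched as follows (illustrative only, not Fivetran code; JSON events are an assumption, since webhook payloads are arbitrary):

```python
import json


def load_events(raw_events):
    """Split raw payloads into parseable events (appended) and
    unparseable events (skipped; Fivetran would alert the user)."""
    parsed, skipped = [], []
    for raw in raw_events:
        try:
            parsed.append(json.loads(raw))
        except json.JSONDecodeError:
            skipped.append(raw)
    return parsed, skipped


events = [
    '{"event": "click", "user": 1}',
    '{"event": "view", "user":',  # incomplete payload, cannot be parsed
    '{"event": "purchase", "user": 2}',
]
parsed, skipped = load_events(events)
print(len(parsed), len(skipped))  # 2 1
```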

Fivetran is always collecting data and stores it long term in a staging S3 bucket. The changes are then loaded at a 15-minute interval by default. You can change this upload interval in your dashboard.

Fivetran does not propagate deleted data for events, because the events have already happened.


Schema changes

Fivetran does not propagate schema changes. If you stop tracking a metric, that metric will have a null value in new rows; all the historical data will still exist. If you stop tracking a table, the table will still exist, but you won't get any new rows. It's not possible to delete a row in the source because the event has already occurred.

Webhooks connection mapping

Fivetran supports a webhook integration for POSTing arbitrary data directly into your destination. Webhook connections occur at the table level. Each connection created within Fivetran (shown below in the blue connection icon circle) is given a specific URL to post data to. Any data posted to that URL is added to a single destination table.

[Image: Fivetran webhook integration overview]
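Sending an event to a webhook connection can be sketched with the Python standard library. This is a minimal sketch, not Fivetran client code: the URL below is a hypothetical placeholder (the dashboard shows the real URL for your connection), and the JSON event shape is an assumption, since the payload is arbitrary.

```python
import json
import urllib.request

# Hypothetical placeholder URL; use the URL shown for your connection
# in the Fivetran dashboard. Each URL feeds a single destination table.
WEBHOOK_URL = "https://webhooks.fivetran.com/webhooks/<connection-id>"


def build_event(name, properties):
    """Serialize an arbitrary event as a JSON request body."""
    return json.dumps({"event": name, "properties": properties}).encode("utf-8")


def send_event(name, properties):
    """POST one event to the connection's URL and return the HTTP status."""
    req = urllib.request.Request(
        WEBHOOK_URL,
        data=build_event(name, properties),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status


# send_event("signup", {"user_id": 42, "plan": "free"})
```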


Other event tracking services connection mapping

Fivetran supports a few pre-existing tracking services, namely Segment and Snowplow. Each connection created within Fivetran is given a specific URL to post data to (shown below in the blue connection icon circle). You can use either the Segment or Snowplow tracking service and send all of the events to a specific Fivetran URL. Unlike the generic webhooks connection, any data posted to that URL is automatically normalized into a set of destination tables within the same schema.

[Image: Fivetran event tracking overview]

Naming

For Webhooks, you can name the destination table in the dashboard while creating the connector. Fivetran creates the new destination table for you automatically, so it's best not to create that table in advance. You can also designate whichever schema you would like the table to reside in. If the schema you select does not already exist, Fivetran automatically creates it for you in your destination.

For Segment or Snowplow, you can name the destination schema in the Dashboard while creating the connector. Within your destination schema, we will create a set of default tables for you in your destination.


XML format support

Fivetran supports XML format messages for the following connectors:

  • Apache Kafka
  • AWS MSK
  • Azure Event Hubs
  • Confluent Cloud
  • Heroku Kafka
  • Azure Service Bus

To sync XML messages, select Text as the Message Type in the connector setup form. We sync XML data to a TEXT data type column in your destination.

You can parse the XML data from the TEXT column using destination-native methods.

NOTE: Standard limitations, such as the maximum data size allowed in TEXT columns for different destinations, apply to the data.

Parse XML from TEXT

Snowflake

You can parse and access the XML data by using a combination of the following Snowflake functions:

  • PARSE_XML
  • CHECK_XML
  • TO_VARIANT
  • XMLGET

Sample Query

WITH xml_variant AS (
  SELECT to_variant(parse_xml(<xml_text_column>)) AS variant
  FROM <xml_text_table>
  WHERE check_xml(<xml_text_column>) IS NULL
)
SELECT get(xmlget(variant, <tag_name> [, <instance_num> ]), '$') FROM xml_variant;
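Outside the destination, the same extraction can be sketched in Python with the standard library. This is a rough client-side analogue of the Snowflake query above, under the assumption that you have already read the TEXT column's value into a string; the `<order>` payload is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical example of what one row of the TEXT column might hold.
xml_text = "<order><id>42</id><status>shipped</status></order>"


def get_tag(xml_string, tag_name):
    """Rough analogue of XMLGET + GET('$'): return the text content of
    the first direct child element with the given tag, or None."""
    root = ET.fromstring(xml_string)
    node = root.find(tag_name)
    return node.text if node is not None else None


print(get_tag(xml_text, "status"))  # shipped
```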

Questions?

We're always happy to help with any other questions you might have! Send us an email.
