Our new feature helps you implement Type 2 slowly changing dimensions for your historical database analytics with no coding needed.
Our goal has always been to develop fully automated and managed data pipelines for every data source so that your analytics workflows can run uninterrupted from the initial, brief setup to ongoing updates to ensure that your data remains fresh. From connector-level updates for performance and API updates, to platform-wide improvements spanning from increased regional compliance, to triggered downstream transformations in the warehouse run in dbt or SQL, we’ve been intent on ensuring that your integration tool supports your desired business outcomes.
That’s why we’ve built our database connectors with features such as:
Automated schema drift handling. Schema changes such as table and column additions or data type changes commonly either break data pipelines or require maintenance to accommodate the change. With Fivetran, these changes are automatically pushed to your data warehouse so that your teams can stay up to date with the latest business processes and tracked metrics.
Log-based replication. To maintain the highest level of data integrity, we connect to the transaction logs of each database connector to capture updates made to your data. This lowers the typical level of overhead associated with traditional extract queries.
We’ve also recently expanded our list of data sources supported by history mode to include our database connectors: Oracle, SQL Server, MongoDB, DynamoDB, PostgreSQL and MySQL.
History Mode is our solution for unlocking historical analysis of your data sources. We create a record of every change in your database transaction logs and append those records to tables you specify so that you can track the changes made to your data over time.
The usual term used to describe what History Mode tracks is Type 2 Slowly Changing Dimensions (SCD), and the usual method to achieve this in lieu of native support in the data integration tool is to build it yourself. In a previous post, we dove into considerations when building this yourself, but in short, the two primary issues that you’ll come across when scripting Type 2 SCD yourself are:
Data integrity checks to ensure that you’re capturing every transaction made to your table, along with timestamps, to have a reliable history of data to draw upon for insights
Additional checks to capture deleted data and determine when values are no longer valid
In the UI, this new feature can be enabled by selecting “history mode” for specific tables from the Schema tab of your database connector(s). That’s all you need to do to immediately start capturing the change record of your data!
Fivetran will extract the timestamps associated with transactions in your database to create additional rows that denote the values associated with each row during their life cycles. An example of what this looks like can be found below!
In this example, we’re looking at a table that’s capturing how an order has changed over time. The order remains open until the purchase has gone through, and it looks like we have an indecisive customer who continues to add to their cart.
With History Mode enabled, in addition to each change to the row, 3 additional columns are added to help you track the lifecycle of each row. “fivetran_start” and “fivetran_end” denote when a row was current. The last column added is “fivetran_active”, which is a boolean column where FALSE means that the row is no longer the most up to date version, and TRUE indicates that the row is the most current version. Timestamps and changed records are extracted from the logs themselves and are not impacted by a connector’s update frequency.
In the case of Salesforce, it’s apparent that there are a few universally shared use cases for historical analysis of Salesforce data, but the primary difference in database historical analysis is that the type of data tracked in each database is determined by individual business models. Some use cases for historical tracking on databases are:
Tracking user journeys through product line and feature usage that can be used for future product roadmaps or customer health indicators
Tracking the progression of an order including changes users have made to their orders as well as logistics involved with manufacturing and distribution
“It’s going to save us a bunch of time in regards to reporting, lookbacks, debugging product problems...we would’ve been able to track a gap in the application tracking that would be instrumental in patient-provider relations that much earlier” - Brian Moore, Lead Data Engineer, CirrusMD
“Understanding from an analytics perspective...that can all be helped by this history tracking”- Parag Shah, Director of Data Architecture, Rocket Software
“History Mode can allow our software developers to focus on what they do best, engineering fast transactions on our platform, without having to worry about how to capture data changes that occur within those transactions. The business systems team now has access to ephemeral, changing datasets effortlessly. We're incredibly pleased with this new option” - Beth Mattson, Data Engineer, Field Nation
If you’re a current customer, then you'll need to reach out to us to enable this feature for your account. From there, you can follow the remaining steps. If you already have an existing connector, you can log into your dashboard, select the connector, navigate to the “schema” tab of your chosen connector, and then use the dropdown menu to the right of each table name to enable History Mode. If not, you can start by creating a new database connector and then following the instructions above.
If you’re considering Fivetran for your data integration needs, we offer a free 14 day trial. This trial doesn’t begin until after you’ve already finished your first initial, historical sync of a chosen data source. With this recent History Mode Launch, there’s no better time to sign up and experience seamless integration built to power analytics.