Sometimes it’s interesting to capture how an attribute has changed over time. Examples include a wine rating, a baseball player’s batting average, the scheduled departure time of a flight, and so on. In the analytics world, an attribute like this is known as a slowly changing dimension, or SCD.
Yet current-state tables, like those typically belonging to a production database or a SaaS product, reflect a type 1 SCD implementation, meaning that values are continually being overwritten as new data comes in. Such tables possess not so much a short-term memory as a one-level-deep memory. This is because, most of the time, applications need to recall only the latest truths. They may perform brilliantly, round hundreds of corner cases with aplomb, delight and distribute, power and push; most of them, however, remain amnesiacs.
That’s okay for machines, but what about the actual people working to satisfy human business requirements? Well, for analysts, tables with a one-level-deep memory lead to frustration. Want to count customers by account type? No problem, point your query at this MySQL table! But forget about putting a time dimension in your chart, since said table doesn’t retain previous account type values. Out of sight, out of mind.
As analysts, we’re not always answering the how-are-things-now. We’re also called upon to answer the how-we-got-to-now, in order to draw more solid inferences about our journey thus far, as well as about the future. As a Fivetranner—and as a former historian to boot—I claim to offer no impartial opinion on the subject, but I’m pretty excited about our new “History Mode” feature. So what does it do?
History Mode turns your type 1 SCD implementation into type 2, which is to say that instead of overwriting a record, it creates a new row whenever a value changes. This new row then becomes the active row for a given primary key, while the outdated values are consigned to an inactive row.
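To make the difference between the two types concrete, here is a minimal Python sketch. It is not how History Mode is implemented; it only illustrates the general type 1 versus type 2 idea. The field names (`id`, `values`, `valid_from`, `valid_to`, `is_active`) and the sample customer are hypothetical, chosen for illustration.

```python
from datetime import datetime


def apply_type1(table: dict, key, values: dict) -> None:
    """Type 1: overwrite in place, so the previous values are lost."""
    table[key] = dict(values)


def apply_type2(history: list, key, values: dict, changed_at: datetime) -> None:
    """Type 2: retire the active row for `key`, then append a new active row.

    All field names here are hypothetical, not an actual warehouse schema.
    """
    for row in history:
        if row["id"] == key and row["is_active"]:
            if row["values"] == values:
                return  # nothing changed, so keep the current active row
            row["is_active"] = False
            row["valid_to"] = changed_at
    history.append(
        {
            "id": key,
            "values": dict(values),
            "valid_from": changed_at,
            "valid_to": None,  # open-ended while the row is active
            "is_active": True,
        }
    )


# Hypothetical customer 42 moves from basic to premium.
t1 = datetime(2020, 1, 1)
t2 = datetime(2020, 3, 15)
current = {}
history = []
for ts, vals in [(t1, {"account_type": "basic"}), (t2, {"account_type": "premium"})]:
    apply_type1(current, 42, vals)
    apply_type2(history, 42, vals, ts)
```

After both updates, `current` holds only the latest value for customer 42, while `history` holds two rows: an inactive one for the basic period and an active one for premium.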
Too abstract? Not to worry. In practice, History Mode turns your data warehouse into a veritable time machine, rather than a dinghy adrift in an eternal present. Want to see how a user’s favorite content type has changed over time? Your app may not care, but you do. With History Mode, you’ve got that. Looking for the exact moment when that hard-to-upsell customer switched from basic to premium? Turn on History Mode and you’re covered. What about those pesky mappings whose state changes your engineers never instrumented? Check.
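As a rough illustration of the kind of question a type 2 table can answer, the Python sketch below looks up a customer’s plan as of a given date and finds the moment of a basic-to-premium switch. The rows, the column names (`customer_id`, `plan`, `valid_from`, `valid_to`, `is_active`), and both helper functions are assumptions made up for this example; in practice you would more likely express the same logic as a query against your warehouse.

```python
from datetime import datetime

# Hypothetical type 2 history for one customer; all names and dates are invented.
rows = [
    {"customer_id": 7, "plan": "basic", "valid_from": datetime(2019, 5, 1),
     "valid_to": datetime(2020, 2, 10), "is_active": False},
    {"customer_id": 7, "plan": "premium", "valid_from": datetime(2020, 2, 10),
     "valid_to": None, "is_active": True},
]


def plan_as_of(rows, customer_id, when):
    """Return the plan in effect at `when`, or None if no row covers that time."""
    for row in rows:
        if row["customer_id"] != customer_id:
            continue
        ends = row["valid_to"]
        # Each row covers the half-open interval [valid_from, valid_to).
        if row["valid_from"] <= when and (ends is None or when < ends):
            return row["plan"]
    return None


def switched_at(rows, customer_id, old, new):
    """Return when the customer first moved directly from `old` to `new`, else None."""
    ordered = sorted(
        (r for r in rows if r["customer_id"] == customer_id),
        key=lambda r: r["valid_from"],
    )
    for previous, following in zip(ordered, ordered[1:]):
        if previous["plan"] == old and following["plan"] == new:
            return following["valid_from"]
    return None


plan_before = plan_as_of(rows, 7, datetime(2019, 12, 31))
upgrade_time = switched_at(rows, 7, "basic", "premium")
```

A current-state table can answer neither question, because the basic row would have been overwritten at the moment of the upgrade.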
Those of us who have ever gone looking for contextual data of this kind, only to return empty-handed, know the downright emotional response that the experience can evoke: nervousness and disappointment; a sense of loss or even failure; the feeling that one’s analytical house is not in order. Don’t get me wrong: more comprehensive instrumentation is always a good thing. But History Mode can help when this isn’t possible, and for more straightforward use cases it may stand in for instrumentation entirely.
Historically, as it were, periods of rupture such as our own have done more than just shake things up. They’ve also tended to accelerate and deepen structural transformations that were already happening. This will prove no less true for the ascent of data in the daily operation and core strategy of businesses in the twenty-first century. Logistics, monitoring, machine learning, and analytics predicated on the reliable and comprehensive aggregation of data will better position businesses—and their customers—to weather the storms of an increasingly unpredictable world.
Data is transient, and the questions we ask of data become more answerable to the extent that we build safeguards capable of counteracting that transience. Data with a digital memory means more useful data. Don’t abandon your data in the land of forgetfulness, blissfully though your app may dwell there. Activate your data with a link to the past, and in so doing you’ll tie your business more tightly to the future.
Learn more about History Mode in the Fivetran Documentation.