The Difference Between Monthly Active Rows and Total Synced Rows
Fivetran recently moved to a consumption-based model leveraging monthly active rows (MAR). We believe this is a better way for our customers to maximize the value of Fivetran. Here’s why.
Active Rows and Pipeline Efficiency
First, let's define "monthly active row." The two main components of MAR are:
- Rows at Rest: The total number of primary keys in the data source.
- Update Rate: The percentage of primary keys in the source that are added or updated at least once in a given month. This is typically 10-20%.
A row becomes active when it is added to or updated in a data destination, such as a data warehouse. We count a row as active only once per month, no matter how many times it’s updated. This means you’re not charged for multiple updates to the same row in a single month.
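To make the "once per month" rule concrete, here is a minimal Python sketch that counts distinct primary keys per calendar month from a list of change events. The function name `count_monthly_active_rows` and the event format are hypothetical, chosen for illustration only; this is not Fivetran's actual metering logic.

```python
from collections import defaultdict
from datetime import date


def count_monthly_active_rows(change_events):
    """Count distinct primary keys added or updated in each calendar month.

    change_events is an iterable of (primary_key, change_date) pairs.
    This event format is an assumption made for the example.
    """
    keys_by_month = defaultdict(set)
    for primary_key, change_date in change_events:
        month = (change_date.year, change_date.month)
        # A set keeps each key once, however many times it changes.
        keys_by_month[month].add(primary_key)
    return {month: len(keys) for month, keys in keys_by_month.items()}


# Row 101 changes three times in March but is counted once for that month.
events = [
    (101, date(2024, 3, 1)),
    (101, date(2024, 3, 9)),
    (101, date(2024, 3, 27)),
    (202, date(2024, 3, 15)),
    (101, date(2024, 4, 2)),
]
counts = count_monthly_active_rows(events)
print(counts)  # {(2024, 3): 2, (2024, 4): 1}
```

Five change events produce only two active rows in March and one in April, because repeat updates to the same key within a month do not add to the count.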
We can price based on monthly active rows because Fivetran connectors are designed to capture changes in the data source efficiently and perform incremental upserts wherever possible. As a result, MAR ends up being 10-100x smaller than the total synced rows you’ll see from typical pipelines. This can also reduce the cost of managing a cloud destination, since only the data that actually needs to be replicated is brought over.
Total Synced Rows and Update Waste
We learned some time ago that monthly active rows are not the same as the total synced rows our customers see in their pipelines. This is because a typical pipeline generates waste, which happens when rows are synced more often than necessary. There are two main sources:
- Multiple Row Updates: A single row, defined by a unique primary key, can be updated several times in a single month, and each update counts as a synced row. On average, an active row is updated about 5 times a month.
- Snapshot Waste: This happens when a primary key that wasn't actually updated is synced anyway (e.g., when you replicate a table using snapshots). Capturing updates is hard, so many pipelines resort to a snapshot approach that syncs all rows every time. Snapshots typically run 10-20 times a month.
Over the course of a month, let alone years, a typical data pipeline generates a great deal of waste because it was never built to handle incremental changes effectively.
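The gap between the two approaches can be sketched with a toy simulation. The function names and the tiny data set below are hypothetical, and the model is deliberately simplified (it ignores deletes, schema changes and re-syncs), so treat it as an illustration of the idea and not as a description of any real pipeline.

```python
def synced_rows_snapshot(table_size, changes_per_sync):
    """A snapshot sync copies every row on every run, changed or not."""
    return table_size * len(changes_per_sync)


def synced_rows_incremental(changes_per_sync):
    """An incremental sync copies only the rows that changed since the last run."""
    return sum(len(changed) for changed in changes_per_sync)


def active_rows(changes_per_sync):
    """Distinct primary keys that changed at least once across all runs."""
    return len(set().union(*changes_per_sync))


# A 1,000-row table synced four times in a month.
# Each set holds the primary keys that changed before that sync.
table_size = 1000
changes_per_sync = [{1, 2, 3}, {2, 3}, {3, 4}, set()]

print(synced_rows_snapshot(table_size, changes_per_sync))  # 4000
print(synced_rows_incremental(changes_per_sync))           # 7
print(active_rows(changes_per_sync))                       # 4
```

In this example only four rows were ever active, yet the snapshot approach moves 4,000 rows. Even the incremental count (7) is higher than the active-row count, because rows 2 and 3 were updated more than once.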
Calculating MAR vs. Total Synced Rows
Now that we know the difference between MAR and total synced rows, we can finally calculate just how different they can be with the same amount of data.
To calculate MAR:
Total Rows at Rest * Update Rate % = MAR
To estimate your total synced rows:
(Average # of Row Updates * MAR) + (# of Monthly Snapshots * Total Rows at Rest) = Total Synced Rows
What if we were to take a database of 10M rows at rest and compare the two counts? Assume a 10% update rate, an average of 5 updates per active row and 10 snapshots per month.
Monthly Active Rows:
10,000,000 * 10% = 1,000,000 (1M)
Total Synced Rows:
(5 * 1,000,000) + (10 * 10,000,000) = 105,000,000 (105M)
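If you want to try the two formulas with your own numbers, they can be sketched in a few lines of Python. The function names `estimate_mar` and `estimate_total_synced_rows` are hypothetical, and the inputs are the same illustrative values used in the example above, not measurements from a real pipeline.

```python
def estimate_mar(rows_at_rest, update_rate):
    """MAR = Total Rows at Rest * Update Rate.

    update_rate is a fraction, e.g. 0.10 for 10%.
    """
    return round(rows_at_rest * update_rate)


def estimate_total_synced_rows(rows_at_rest, update_rate,
                               avg_row_updates, monthly_snapshots):
    """Total Synced Rows =
    (Average # of Row Updates * MAR) + (# of Monthly Snapshots * Total Rows at Rest).
    """
    mar = estimate_mar(rows_at_rest, update_rate)
    return avg_row_updates * mar + monthly_snapshots * rows_at_rest


# Same inputs as the worked example: 10M rows at rest, 10% update rate,
# 5 updates per active row, 10 snapshots per month.
mar = estimate_mar(10_000_000, 0.10)
total = estimate_total_synced_rows(
    10_000_000, 0.10, avg_row_updates=5, monthly_snapshots=10
)
print(mar)    # 1000000
print(total)  # 105000000
```

Swapping in your own rows at rest, update rate, update frequency and snapshot count gives a rough first estimate; actual usage will depend on how your sources behave.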
Pay for What You Use
Now that you know how to estimate your MAR and how it differs from total synced rows, you can more accurately predict the price of using Fivetran by estimating how many credits you will need. If you need help with your estimate, don’t hesitate to set up time with a product specialist! We’d love to hear your feedback on the new model and start a partnership today.