01 Nov 2018 | Article

Build vs. Buy: Some Back-of-the-Envelope Costs

Why it's ill-advised to build your own data pipeline
Charles Wang

There is a good chance that your company is among the more than 150,000 global customers of Salesforce. Chances are also strong that you use services like Marketo, Zendesk, Jira, and Zuora to gain a comprehensive view of your business operations.

The research and advisory firm Gartner estimates that 70% to 80% of business intelligence efforts fail, in part because of outdated technology, clunky processes, and inaccessibility. The path to failure can be expensive, too, and the goal of this article is to present some of the costs in money, time, and anguish associated with building a bespoke business intelligence solution.

Building Your Own Connectors Is Complicated and Expensive

Suppose you use the five services listed above and want to build connectors that automatically ingest data from their API endpoints and store it in a data warehouse. Here are some slightly optimistic calculations for monetary costs:

Each of the five connectors will take about five weeks for an engineer to build:

(5 connectors) * (5 personWeeks)

Based on what Fivetran has found through previous experience, each connector will likely need a dedicated week of maintenance work per quarter, adding up to four weeks per year:

(5 connectors) * (5 personWeeks + 4 personWeeks)

(5 connectors) * (9 personWeeks) = 45 personWeeks

That makes 45 weeks out of 52 weeks in a year. Multiply that fraction by a typical software engineer salary ($120,000) to see how much building your own connectors will cost during your first year:

(45 weeks/52 weeks) * ($120,000) = $103,846.15

In subsequent years, your engineer will continue to update each connector once per quarter (four weeks per year) and handle bugs and edge cases as they crop up (one week), for a total of five person-weeks of work per connector:

(5 connectors) * (5 personWeeks) = 25 personWeeks

That makes 25 weeks out of 52 weeks in a year, all dedicated to ongoing maintenance. This is how much maintaining your own connectors will cost in subsequent years:

(25 weeks/52 weeks) * ($120,000) = $57,692.31
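
The arithmetic above can be reproduced with a short Python sketch. The function and parameter names are illustrative assumptions rather than anything Fivetran publishes, and the inputs are simply the figures used in this article.

```python
WEEKS_PER_YEAR = 52


def connector_labor_cost(connectors, weeks_per_connector, annual_salary):
    """Return (person_weeks, cost) for one year of connector work.

    Hypothetical helper: it only restates the article's formula,
    (person-weeks / 52) * salary.
    """
    person_weeks = connectors * weeks_per_connector
    cost = person_weeks / WEEKS_PER_YEAR * annual_salary
    return person_weeks, round(cost, 2)


# First year: 5 weeks to build + 4 weeks of quarterly maintenance per connector.
first_year = connector_labor_cost(5, 5 + 4, 120_000)

# Subsequent years: 4 weeks of quarterly updates + 1 week of bug fixes.
later_years = connector_labor_cost(5, 4 + 1, 120_000)

print(first_year)   # (45, 103846.15)
print(later_years)  # (25, 57692.31)
```

Changing the first argument is a quick way to see how the total grows with the number of data sources under these assumptions.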

I won’t give the game away in terms of the pricing model for Fivetran connectors, but a yearly subscription to the five connectors listed earlier will cost far less than either of the figures above. These costs will, of course, scale in direct proportion to the number of data sources you use.

Manual Reporting Takes Too Long and Turns Your Analysts into Bottlenecks

The alternative to constructing a sophisticated and maintenance-intensive business intelligence and data science infrastructure is to assemble reports and analyses manually. An analyst from one of our customers estimated that their manual reports routinely took “a month,” or 160 hours of work.

Consider the following workflow:

120 hours
- Collect files manually (spreadsheets, CSVs, JSON files)
- Consult managers
- Wait on replies
- Run API ingestion scripts

40 hours
- Clean, format, and transform data
- Perform analysis
- Build visualizations
- Write report

= 160 hours

Such a lengthy workflow effectively limits the frequency of reports and findings, needlessly consumes an analyst’s time, and makes simple metrics inaccessible to the business users who need them to make decisions.

When our customer switched to Fivetran, the time commitment for each report shrank to less than a week, more than quadrupling the speed at which the company could make data-driven decisions. Another customer similarly lopped off 140 hours of work every week, and estimates that they gained a 200% ROI by switching to Fivetran.

Now, the workflow looks more like:

40 hours
- Clean, format, and transform data
- Perform analysis
- Build visualizations
- Write report

= 40 hours

Furthermore, there is now time to dedicate to more sophisticated analyses. The flow could be something like this:

60 hours
- Clean, format, and transform data
- Perform analysis
- Build visualizations
- Machine learning and statistical modeling
- Write report

= 60 hours
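
The hour totals in these workflows can be compared with a few lines of Python. This is only a rough sketch: the variable names are made up for illustration, and the 40-hour work week is an assumption used to convert hours into weeks.

```python
HOURS_PER_WORK_WEEK = 40  # assumption: a standard full-time week

manual_hours = 120 + 40   # data collection + analysis and reporting
automated_hours = 40      # analysis and reporting only
extended_hours = 60       # adds machine learning and statistical modeling


def speedup(before_hours, after_hours):
    """How many times faster a report is delivered after the change."""
    return before_hours / after_hours


print(manual_hours)                                     # 160
print(manual_hours / HOURS_PER_WORK_WEEK)               # 4.0 (work weeks)
print(speedup(manual_hours, automated_hours))           # 4.0
print(round(speedup(manual_hours, extended_hours), 2))  # 2.67
```

At exactly 40 hours per report the speedup is 4x; any report finished in less than a week pushes it above that, and even the 60-hour workflow with modeling is still more than twice as fast as the manual one.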

Those of us who are familiar with agile methodology, OODA loops, or competitive esports understand that it is always advantageous to make informed decisions more rapidly. Imagine only being able to act on data once a month!

Morale Is Hard to Quantify but Very Important

If you want to keep your analysts, engineers, and managers happy, you should consider the following problems associated with building your own connectors or manual reporting:

  1. Diversion from other software engineering, data science, or analytics duties — this is a very common irritant among new data scientists at understaffed organizations and leads to turnover
  2. Frustration and exhaustion from the complexity of maintaining data integrity, particularly by persons lacking the appropriate training
  3. Continually increasing complexity (and downtime) as additional sources of data are added
  4. Misguided decisions caused by lags between requests for business intelligence and delivery of actionable insights — insights that might be stale by the time they arrive

An engineer friend of mine jokes that “database maintenance” is the worst chore he has ever encountered in his career.

It isn’t a very funny joke.

Learning Curves Can Be Steep; Let the Experts Handle It

The five-week estimates I introduced earlier refer to APIs that are straightforward or user-friendly. But not all APIs are straightforward: some ignore best practices, some are poorly documented, and some are just very complex.

One of the most popular connectors Fivetran offers is the NetSuite connector. It took Fivetran six months to build out the initial version. The second iteration took a year, and the third took yet another year. Only then did we have a truly mature, stable, and well-functioning piece of software.

Q4 2015 to Q2 2016: initial version (0.5 year)
Q2 2016 to Q2 2017: second iteration (1 year)
Q2 2017 to Q2 2018: third iteration (1 year)

= 2.5 years

The NetSuite connector is popular largely because so many companies attempting to use the API have been stymied by its complexity. The DIY approach to the NetSuite connector is not advisable. At Fivetran, on the other hand, we have people who have spent the last two and a half years thinking about how to crack this particular nut.

Make the Division of Labor Work in Your Favor

The division of labor is directly responsible for humanity’s greatest commercial, scientific, and technological accomplishments.

But many of the data engineering skills necessary to construct data pipelines are not formally taught in academic programs, boot camps, or training programs. They are scarce human capital, often developed the hard and expensive way: through experience, trial, and error. People in adjacent roles (analysts, software engineers, and data scientists) often find themselves performing these duties poorly and against their wishes.

Given the value of labor specialization, there’s no reason for you and the thousands of other companies using Salesforce, Marketo, and other software to build your own API connectors when an off-the-shelf solution exists. Fivetran has already climbed the learning curve for you so that you can spend your time and energy building your core product and making sense of your operations.
