Data scientists and engineers at the online data science education platform focus on attribution, customer lifetime value and the ideal customer profile.
To avoid the hassle of building and maintaining a pipeline internally, DataCamp added Fivetran to its data stack. During a two-week proof of concept, the data engineering team was able to configure the connectors, work through issues with support, and get Fivetran up and running smoothly. The data science team now has access to reliable data and can conduct critical analyses of customer lifetime value, attribution and the ideal customer profile.
Destination: Postgres RDS Warehouse
Business Intelligence Tool: Custom tool
DataCamp was founded in 2013 with a mission to democratize data science education. It now offers more than 340 online courses on Python, R, SQL, data engineering, data visualization, and more. Gaëtan Podevijn joined DataCamp in 2018 as its first data engineer, working across both the Data and Infrastructure team and the Data Science team to establish the data infrastructure and ensure alignment between data scientists and engineers.
DataCamp uses a number of SaaS applications to increase efficiency throughout the business, but valuable data was siloed within those applications. Podevijn explains the problem:
We need the data to monitor the business, from financial health to marketing campaign efficiency and sales, and this data helps us make critical business decisions both in the short and long term. You could, in theory, build the pipeline to get the data into your data lake or warehouse yourself. But if you’re a data engineer like me, you know that ingesting sources is not an easy task.
He shares some of the challenges with building a pipeline internally:
For each source, you have to read the docs, learn how the API works, write the code, and move the data to your destination
Each source has its own API
Pipeline maintenance, including monitoring, notifications and fixing breakages
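To make the per-source effort concrete, here is a minimal sketch of a hand-built ingestion job for a single source. It is an illustration only, not DataCamp's or Fivetran's code: `fake_invoices_api` is a hypothetical stand-in for a paginated SaaS endpoint, the `invoices` table and its columns are invented, and an in-memory SQLite database stands in for the Postgres warehouse so the example runs on its own.

```python
import sqlite3
from typing import Callable, Dict, Iterator, List, Optional, Tuple

# One page of results plus the cursor for the next page (None when done).
Page = Tuple[List[Dict], Optional[str]]


def fake_invoices_api(cursor: Optional[str]) -> Page:
    """Hypothetical paginated endpoint; a real source would be an HTTP API."""
    pages = {
        None: ([{"id": 1, "amount": 25.0}, {"id": 2, "amount": 300.0}], "p2"),
        "p2": ([{"id": 3, "amount": 25.0}], None),
    }
    return pages[cursor]


def extract(fetch_page: Callable[[Optional[str]], Page]) -> Iterator[Dict]:
    """Follow the source's pagination until no next cursor is returned."""
    cursor = None
    while True:
        rows, cursor = fetch_page(cursor)
        yield from rows
        if cursor is None:
            break


def load(conn: sqlite3.Connection, rows: Iterator[Dict]) -> int:
    """Upsert rows into the destination table and return how many were written."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS invoices (id INTEGER PRIMARY KEY, amount REAL)"
    )
    count = 0
    for row in rows:
        conn.execute(
            "INSERT OR REPLACE INTO invoices (id, amount) VALUES (?, ?)",
            (row["id"], row["amount"]),
        )
        count += 1
    conn.commit()
    return count


conn = sqlite3.connect(":memory:")  # stand-in for the real warehouse
loaded = load(conn, extract(fake_invoices_api))
print(f"loaded {loaded} rows")
```

Even this toy version needs pagination and upsert logic specific to one source. Authentication, rate limits, schema changes, monitoring and alerting would all come on top, and the work repeats for every additional source.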
Knowing the challenges involved in building internally, DataCamp was confident that an automated solution like Fivetran was the right choice. During a two-week proof of concept (POC), Podevijn was able to configure the connectors and work through issues with support to get Fivetran up and running smoothly:
Both DataCamp and Fivetran have a similar culture: We want to provide not only the best platform but the best interaction with sales and support teams. We had a great experience during the POC, which led to a successful onboarding and full implementation.
Now, Fivetran automatically syncs the data, detects API changes, and provides notifications for issues. This frees up time for more critical tasks such as security and increases the data team's productivity.
With all of its data centralized, DataCamp is able to join sources and generate impactful business insights. Below are a few examples:
Customer Lifetime Value
DataCamp leverages data from Zuora, its subscription management service, to compute its B2C customer lifetime value (LTV). Through analysis, a data scientist found that LTV is higher for users who first subscribe to an annual plan, as opposed to those who first subscribe to a monthly plan, and is also higher for users in the US, as opposed to other high-income countries. The dashboard, which is useful for finance, marketing and sales, displays additional splits such as users who subscribed during a promotion.
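The split by first plan can be sketched as follows. This is a simplified illustration, not DataCamp's model: it assumes LTV is approximated as the average total revenue per user within a segment, that payments arrive in chronological order, and that each record is a `(user, plan, amount)` tuple. All names and values are invented and bear no relation to DataCamp's actual figures.

```python
from collections import defaultdict
from statistics import mean

# Invented payment records: (user_id, plan at time of payment, amount paid).
payments = [
    ("u1", "annual", 300.0),
    ("u2", "monthly", 25.0),
    ("u2", "monthly", 25.0),
    ("u3", "annual", 300.0),
    ("u3", "annual", 300.0),
    ("u4", "monthly", 25.0),
]


def ltv_by_first_plan(payments):
    """Average total revenue per user, grouped by the plan of the first payment."""
    revenue = defaultdict(float)
    first_plan = {}
    for user, plan, amount in payments:
        revenue[user] += amount
        # Only the first payment seen for a user defines their segment.
        first_plan.setdefault(user, plan)

    totals_by_plan = defaultdict(list)
    for user, total in revenue.items():
        totals_by_plan[first_plan[user]].append(total)
    return {plan: mean(totals) for plan, totals in totals_by_plan.items()}


ltv = ltv_by_first_plan(payments)
print(ltv)
```

The same grouping could be keyed on country or on whether the first subscription happened during a promotion to produce the other splits mentioned above.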
Ideal Customer Profile
In addition to individual subscriptions, DataCamp for Business is available for groups of five or more who would like to upskill their teams at scale. On the Enterprise plan, businesses can easily integrate with their SSO/LMS provider, create custom learning paths and assignments, and measure the impact of online training via performance dashboards. Companies using DataCamp achieve course completion rates six times higher than those of traditional online course providers. Data science and machine learning engineers at DataCamp developed a model that predicts which companies are most likely to benefit from the platform. The model was trained on data coming from Fivetran, and sales teams use it for targeting.
Attribution
Facebook Ads and Google Ads are two important advertising sources for DataCamp. The maximum attribution window for Facebook is 28 days, compared to 90 days for Google. DataCamp needed a larger window for Facebook data to understand attribution, as Podevijn explains:
It's very possible that the amount of time that passes from the point that users click on a Facebook ad, register for DataCamp, and complete a free course to the point they decide to subscribe is greater than 28 days. One of the data scientists conducted a study on the effect of a longer attribution window on Facebook Ads and, by combining data from Fivetran with internal data, showed that we could significantly increase the margin of our customer acquisition target because there are acquisitions that occur that aren’t counted by Facebook due to its small attribution window.
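The effect of the window length can be illustrated with a small sketch. It is not the study described above: it assumes each user journey is reduced to an ad click date and an optional subscription date, and the dates are invented purely to show how a longer window changes the count of attributed subscriptions.

```python
from datetime import date, timedelta

# Invented journeys: (date of ad click, date of subscription or None).
journeys = [
    (date(2020, 1, 1), date(2020, 1, 10)),  # subscribed after 9 days
    (date(2020, 1, 5), date(2020, 2, 20)),  # subscribed after 46 days
    (date(2020, 1, 7), date(2020, 5, 1)),   # subscribed after 115 days
    (date(2020, 1, 9), None),               # never subscribed
]


def attributed(journeys, window_days):
    """Count subscriptions that fall within window_days of the ad click."""
    window = timedelta(days=window_days)
    return sum(
        1
        for click, subscribed in journeys
        if subscribed is not None and timedelta(0) <= subscribed - click <= window
    )


within_28 = attributed(journeys, 28)
within_90 = attributed(journeys, 90)
print(within_28, within_90)
```

With these made-up dates, a 28-day window credits one subscription to the ad, while a 90-day window credits two. Acquisitions missed by the shorter window make the cost per acquisition look higher than it really is.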
About Fivetran: Shaped by the real-world needs of data analysts, Fivetran technology is the smartest, fastest way to replicate your applications, databases, events and files into a high-performance cloud warehouse. Fivetran connectors deploy in minutes, require zero maintenance, and automatically adjust to source changes — so your data team can stop worrying about engineering and focus on driving insights.