Leaders in every business are excited about bringing cutting-edge data science to their industry, and data science is the most talked-about career of the last 10 years. But many businesses hire data scientists before they are ready, and these highly paid team members end up spending their time doing basic data integration and reporting.
What does it take to be ready to hire data scientists? To answer this question, we need to understand the data hierarchy of needs:
- Top of the pyramid: data science
- BI and analytics
- Clean, curated data
- Base of the pyramid: an enterprise data warehouse
The base of the pyramid is a solid enterprise data warehouse. Before you can do anything else, you need to get all your data in one place. You should use a modern cloud-based data warehouse, and it should contain a faithful, always up-to-date replica of all the data in all your business systems. By taking a replication-first approach, you will have a single system that can support all your data questions today and in the future.
The second level of the pyramid is a clean, curated view of all this data. Real-world data is messy, and the same concepts are often duplicated between systems. For example, you may have data about your customers in both your CRM system and your accounting system, and there may be small contradictions between these systems. As part of the data-cleanup process, you'll decide which system is the system of record for each concept. Your analysts will write SQL queries that resolve these contradictions and convert the "raw tables" delivered by your data pipeline into a simplified view of your data that will provide the foundation for everything else you do.
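To make the cleanup step concrete, here is a minimal sketch of what one of those SQL queries might look like. It uses Python's built-in sqlite3 module as a stand-in for a cloud data warehouse. The table names (`crm_customers`, `accounting_customers`), the columns, and the sample rows are all hypothetical, and treating the CRM as the system of record for customer name and email is an assumption made purely for illustration.

```python
import sqlite3

# Hypothetical "raw tables", as a replication pipeline might deliver them.
RAW_SCHEMA = """
CREATE TABLE crm_customers (
    customer_id INTEGER PRIMARY KEY, name TEXT, email TEXT
);
CREATE TABLE accounting_customers (
    customer_id INTEGER PRIMARY KEY, name TEXT, billing_email TEXT
);
"""

# Curated view. Assumption for this sketch: the CRM is the system of record
# for name and email, so accounting values are used only to fill gaps.
CURATED_VIEW = """
CREATE VIEW customers AS
SELECT
    c.customer_id,
    COALESCE(c.name, a.name) AS name,
    COALESCE(c.email, a.billing_email) AS email
FROM crm_customers AS c
LEFT JOIN accounting_customers AS a
    ON a.customer_id = c.customer_id;
"""


def build_curated_db():
    """Create an in-memory database with sample raw data and the curated view."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(RAW_SCHEMA)
    conn.executemany(
        "INSERT INTO crm_customers VALUES (?, ?, ?)",
        [
            (1, "Acme Corp", "ops@acme.example"),
            (2, "Globex", None),  # email missing in the CRM
        ],
    )
    conn.executemany(
        "INSERT INTO accounting_customers VALUES (?, ?, ?)",
        [
            (1, "ACME Corporation", "billing@acme.example"),  # contradicts the CRM
            (2, "Globex", "ap@globex.example"),
        ],
    )
    conn.executescript(CURATED_VIEW)
    return conn


def curated_customers(conn):
    """Return the simplified, one-row-per-customer view."""
    return conn.execute(
        "SELECT customer_id, name, email FROM customers ORDER BY customer_id"
    ).fetchall()


if __name__ == "__main__":
    for row in curated_customers(build_curated_db()):
        print(row)
```

In this sketch, the contradictory name for customer 1 is resolved in favor of the CRM, while the missing email for customer 2 is filled in from accounting. Everything downstream then queries `customers` rather than the raw tables, so the decision about the system of record is made once instead of being repeated in every report.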
The third level of the pyramid is classic business intelligence and analytics. These are the spreadsheets and dashboards that provide day-to-day decision support for your managers and leaders. Data science may be getting more attention these days, but traditional business intelligence is still the foundation of using data to make decisions. This type of work is also done by analysts, whose primary tools are SQL and visualization tools like Tableau or Looker.
The tip of the pyramid is data science. If you've hired well and done a good job building the lower levels of the pyramid, your data scientists can spend their time applying their specialized skills in advanced statistics and modeling, building on the data integration and cleanup that your analysts have already done.
Think of your data scientists as the star athletes of your data team. Like the starting lineup of the Golden State Warriors, they have highly specialized skills and are supported by a much larger cast of teammates. A typical NBA basketball team will have five starters, 10 other players on the roster and thousands of total employees in the organization. Data scientists are like your starting lineup — you wouldn't want Steph Curry answering phones in the front office, and you shouldn't have your data scientists doing traditional analyst work. They're perfectly capable of doing this work, but it's not the right way to use your most valuable players.
So before you hire a team of star data scientists, make sure you have built the foundation that will make them productive.
A version of this blog post was originally published on Forbes Tech Council.