How to Organize Your Analytics Team

The following is a guest piece by Erik Jones, Director of Product Management, Analytics & Strategic Planning at New Relic, who has authored this piece as part of our Data Champions program. If you're interested in contributing to or learning more about our Data Champions program, please get in touch.

“Big Data” has been a buzzword for a decade now, but analytics initiatives routinely fall short of expectations. One Gartner report predicts that, through 2022, only 20% of analytic insights will deliver business outcomes. Another Gartner study suggests analytic projects fail at a rate close to 85%. NewVantage Partners, in its 2019 Big Data and AI Executive Survey, tells a similar story: 69% of survey respondents say their organizations are not yet data-driven, and an even higher 72% say their firms have yet to develop a data culture. And yet investments in data, business intelligence and analytics continue to increase, at an estimated 30% CAGR from 2020 to 2023. In 2019, the global analytics market was $50bn, more than double its value four years earlier. At an 85% failure rate, that’s a loss of roughly $42bn on a $50bn investment.

There are numerous reasons for this dismal track record. A Google search for “why analytics projects fail” returns 14m+ results. I want to focus on one that tends to get overlooked but is foundational to a meaningful data culture: team structure. Over 90% of the respondents in the NewVantage Partners survey identified issues with people and processes as an obstacle to evolving a culture that embraces and is driven by data. Properly organizing your analytics team can mitigate this challenge.

The problem: Forensic analytics

Stop me if this sounds familiar: Why does customer churn in the monthly ARR report from Salesforce not match customer churn from the retention report in Anaplan? Any analytics lead knows this all too well as the prelude to what I call forensic analytics: the two (or more) teams who pulled customer churn reports will now spend days, sometimes weeks, reconciling data sources, report filters, business logic, and timing to determine the source of reporting incongruity. More often than not, the result is a “bridge” between the two reports because full reconciliation isn’t possible, despite the numerous person-hours put into determining differences. This exercise is then usually repeated on a regular basis as each siloed report becomes integrated into a business unit process.
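To make the mechanics of a mismatch concrete, here is a minimal sketch of how two reports built on the same customers can disagree. Everything in it is hypothetical: the records, the field names, and the two report definitions are illustrative assumptions, not the actual logic of any Salesforce or Anaplan report.

```python
from datetime import date

# Hypothetical customer records; names, segments and dates are made up.
customers = [
    {"id": "A", "segment": "paid",  "cancelled_on": date(2020, 3, 30)},
    {"id": "B", "segment": "paid",  "cancelled_on": date(2020, 4, 1)},
    {"id": "C", "segment": "trial", "cancelled_on": date(2020, 3, 15)},
    {"id": "D", "segment": "paid",  "cancelled_on": None},
    {"id": "E", "segment": "trial", "cancelled_on": date(2020, 3, 20)},
]

MARCH_START = date(2020, 3, 1)
MARCH_END = date(2020, 3, 31)


def churn_report_one(rows):
    """Assumed logic: every account cancelled in calendar March, trials included."""
    return {
        r["id"]
        for r in rows
        if r["cancelled_on"] is not None
        and MARCH_START <= r["cancelled_on"] <= MARCH_END
    }


def churn_report_two(rows):
    """Assumed logic: paid accounts only, with a period close one day later."""
    late_close = date(2020, 4, 1)
    return {
        r["id"]
        for r in rows
        if r["segment"] == "paid"
        and r["cancelled_on"] is not None
        and MARCH_START <= r["cancelled_on"] <= late_close
    }


report_one = churn_report_one(customers)
report_two = churn_report_two(customers)

# The "bridge": the accounts that explain the gap between the two reports.
bridge = {
    "only_in_report_one": sorted(report_one - report_two),
    "only_in_report_two": sorted(report_two - report_one),
}
print(len(report_one), len(report_two), bridge)
```

Neither function is wrong on its own terms. The gap comes from a filter (trials) and a timing rule (the period close) that each team chose independently, which is exactly what the reconciliation exercise has to rediscover.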

The direct cost of forensic analytics is quite high: weeks’ worth of hours spent investigating differences between reports. The opportunity cost might be even higher; time spent on reconciliation is time NOT spent on informing decisions, strategic planning, analytic model development or process automation. Furthermore, the people dedicating time to forensic analytics - quantitative analysts, data scientists, data engineers - are expensive, and they are doing work they are neither uniquely qualified for nor want to do in the first place. There is a reason analytic professionals, despite being in the sexiest job of the 21st century, spend 1-2 hours per week looking for a new job.

Fundamentally, as identified by 90%+ of the NewVantage Partners survey respondents, forensic analytics is a people and process issue. Companies rarely lack data, nor do they lack tools or technology. Instead, organizations often lack the analytic team structure needed to (1) best enable a data-driven culture, and (2) realize the full potential, and ROI, of analytic capabilities. Exploring the differences between potential structures will make clear why purposefully choosing an organizational strategy is one of the first and most important decisions an analytics leader can make.

The cause: Poor analytics team structure

It is useful to think of analytic team structure on a spectrum. At one end is the decentralized (embedded) model. Here, each business unit has fully dedicated resources; think Finance vs. Sales vs. Product vs. Customer Success, each with its own analytics team dedicated to and embedded within the function. Responsiveness to the business and fast turn-around times are the largest perceived benefits of the fully embedded model. The embedded approach is a common starting point for many companies because one or two functions, usually Marketing and Sales, at first have more clearly expressed analytic needs than other groups.

Though decentralization is considered the holy grail of “data democratization”, it is rarely effective. Rather than generating insights from deep data analysis, embedded analytics teams primarily spend time doing three things:

Moving and transforming data between applications – Most meaningful questions involve more than one data source. For example, the Product team wants to know the distribution of Monthly Active Users (MAU) across customer revenue bands. This requires both a product engagement data source (for MAUs) and a Sales data source (for revenue bands). The product analytics team will need to figure out how to extract and merge revenue bands with their product engagement data. Far too often, embedded teams manage these ETL processes in a manual, ad-hoc and non-scalable manner.
Duplicative work – At the same time the Product team is asking about MAUs across revenue bands, the Sales team is asking about user engagement patterns at customers currently on free trial. This requires the sales analytics team to extract product usage data and merge it with data from their Sales and CRM systems. Unfortunately, decentralized teams rarely abstract away from the immediate task and end up doing duplicative work. Both the Sales and Product teams are asking for the same thing, namely merging engagement and sales data.
Reactive analytics – Responsiveness is good, until it’s not. Impactful analytics teams need to balance support for quick turn-around requests with the longer production cycles required for deep analysis, infrastructure development and insight generation. Because decentralized teams are usually measured against and incentivized on responsiveness, their time is highly skewed towards reactive projects at the expense of more meaningful strategic and data infrastructure work.
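The duplication in the Product and Sales examples above can be sketched in a few lines. This is only an illustration under stated assumptions: the account names, the revenue band thresholds, and the helper names `revenue_band` and `build_engagement_by_account` are all hypothetical, and a real pipeline would read from source systems rather than in-memory dictionaries.

```python
from collections import Counter

# Hypothetical extract from a product engagement source: MAU per account.
monthly_active_users = {"acme": 120, "globex": 8, "initech": 45}

# Hypothetical extract from a Sales / CRM source.
accounts = {
    "acme": {"annual_revenue": 250_000, "on_free_trial": False},
    "globex": {"annual_revenue": 0, "on_free_trial": True},
    "initech": {"annual_revenue": 40_000, "on_free_trial": False},
}


def revenue_band(annual_revenue):
    """Assumed banding rule; the thresholds are illustrative only."""
    if annual_revenue >= 100_000:
        return "100k+"
    if annual_revenue > 0:
        return "under 100k"
    return "no revenue"


def build_engagement_by_account(mau, crm_accounts):
    """Merge engagement and sales data once, keyed by account."""
    return [
        {
            "account": name,
            "mau": mau.get(name, 0),
            "band": revenue_band(info["annual_revenue"]),
            "on_free_trial": info["on_free_trial"],
        }
        for name, info in crm_accounts.items()
    ]


merged = build_engagement_by_account(monthly_active_users, accounts)

# Product's question: how is MAU distributed across revenue bands?
mau_by_band = Counter()
for row in merged:
    mau_by_band[row["band"]] += row["mau"]

# Sales' question: how engaged are customers currently on free trial?
trial_engagement = {
    row["account"]: row["mau"] for row in merged if row["on_free_trial"]
}

print(dict(mau_by_band), trial_engagement)
```

Both questions are answered from the same merged rows. Two embedded teams working separately would each build their own version of that merge, along with their own banding rule.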

At the other end of the spectrum is a fully centralized team. Here, all quantitative analysts, data engineers and data scientists report into a central analytics hierarchy, with responsibilities spanning the organization. The primary benefit of this structure is distance between the business and the analytics group. Time and resources are managed to develop technical expertise and modeling capabilities, as opposed to minimizing response time between business question and answer, as with embedded teams. These teams are typically staffed by subject matter experts in computer science, applied math, statistics, and machine learning. Centralized functions can work well in analytically mature organizations, with the time, patience and money to fund what is essentially an internal research capability.

Centralization does incur tradeoffs, however, which makes this type of team structure impractical for all but the most mature, analytically developed companies:

The speed, quality, cost conundrum – Do you want your analytics fast and cheap? Fast and good? Or cheap and good? A centralized function can be fast and good, rarely cheap and good, and never fast and cheap. Data engineers / scientists / analysts are not cheap resources. Good data engineers / scientists / analysts are highly coveted and therefore expensive. Furthermore, timelines for a true data science project (read: machine learning, artificial intelligence) are measured in quarters, not days, weeks or months.
Siloed career development – The more deeply technical expertise is developed, the more likely it is that self-selection will reinforce a centralized model. Analytic generalists tend to prefer varied career experiences and seek opportunities to move across functions and responsibilities. A centralized team can silo career development within specific technical areas and limit growth potential outside the analytics group.
Removal from the business – This is the biggest tradeoff. Centralized analytics teams lack the full business context of problems they are trying to solve, often by design. Requirements gathering, understanding business impact and iterative development all take time to do well. That time comes at the expense of building out data pipelines or cross validating model methodologies. Because a centralized structure places focus on technical development, the full business impact of the group is often not realized.

The cure: Hub-and-spoke analytics

A hub-and-spoke structure consists of a core team of data engineers scalably extracting data from source applications, database engineers landing source data in a centralized data warehouse, and business analysts transforming source data into user-friendly (relational) tables and views for use in reporting, analysis and modeling. Data pipelines, permissioning / access, and transformation / aggregation logic are governed centrally in this analytics “hub”. Data scientists should also be a part of this hub, building technical models on top of these robust data pipelines.
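As a rough illustration of the hub's job, the sketch below lands two source extracts in one store and exposes a single governed view on top of them. It uses an in-memory SQLite database from Python's standard library purely as a stand-in for a real data warehouse; the table names, view name, columns and rows are all hypothetical.

```python
import sqlite3

# In-memory SQLite stands in for the central warehouse (an assumption
# made so the example is self-contained).
warehouse = sqlite3.connect(":memory:")

warehouse.executescript(
    """
    -- Source data landed as-is by the hub's engineers.
    CREATE TABLE src_crm_accounts (account_id TEXT PRIMARY KEY, status TEXT);
    CREATE TABLE src_product_events (
        account_id TEXT, user_id TEXT, event_month TEXT
    );

    -- Transformation logic defined once, in a centrally governed view.
    CREATE VIEW account_monthly_activity AS
    SELECT a.account_id,
           a.status,
           e.event_month,
           COUNT(DISTINCT e.user_id) AS monthly_active_users
    FROM src_crm_accounts AS a
    JOIN src_product_events AS e ON e.account_id = a.account_id
    GROUP BY a.account_id, a.status, e.event_month;
    """
)

warehouse.executemany(
    "INSERT INTO src_crm_accounts VALUES (?, ?)",
    [("acme", "customer"), ("globex", "trial")],
)
warehouse.executemany(
    "INSERT INTO src_product_events VALUES (?, ?, ?)",
    [
        ("acme", "u1", "2020-03"),
        ("acme", "u2", "2020-03"),
        ("acme", "u1", "2020-03"),  # repeat visit, counted once
        ("globex", "u9", "2020-03"),
    ],
)

# Any functional team queries the same view instead of rebuilding the join.
rows = warehouse.execute(
    "SELECT account_id, status, monthly_active_users "
    "FROM account_monthly_activity ORDER BY account_id"
).fetchall()
print(rows)
```

The design point is that the join and the definition of an active user live in one place. Analysts in any function read from the view, so two teams asking related questions start from the same numbers.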

Also building on top of this data architecture are the “spokes”: functionally aligned analytic teams composed of a lead and 2 or more analysts. For example, Finance has a Financial Analytics team lead and 2 or more financial analysts dedicated to the Finance group. The same applies to Sales, Marketing, HR, Customer Success, etc. Each functional group should have a team lead focused on developing strategic capabilities and analytic thought leadership in support of informed decision making, and a set of analysts servicing rapid-response and ad-hoc requests. Both the hub and spokes report into the same senior analytics leadership, which ideally has C-suite representation in a Chief Analytics Officer (CAO) or Chief Data Officer (CDO).

A hub-and-spoke model resolves many of the issues seen with centralized or decentralized teams. Specifically:

Alignment – Unlike the decentralized model, responsibility and accountability for data infrastructure, data governance, and database development and design are aligned to the team best qualified to get it right: data and database engineers.
Comprehensive expertise – Unlike the centralized model, the hub-and-spoke structure enables development of business and technical expertise. Functional leads and analysts are valued for an intimate understanding of their business area and the decisions analytics should inform. They are decision scientists - consulting, influencing strategy and advising decision-making from an analytical perspective. Data and database engineers and data scientists are valued for their ability to abstract away from a specific business issue and create extensible data and modeling architecture.
Balance – Functional leads should be focused on strategic work and becoming thought leaders for their partners. Analysts should provide responsiveness to the business. An analytics group only responding to ad-hoc business questions will never grow beyond a support function. On the other hand, lack of responsiveness will only frustrate business stakeholders. A hub-and-spoke model allows for capacity to balance strategic and support work.
A meaningful career ladder – Professional development can be overlooked in analytics, and any group that overlooks it will eventually lose its best employees. The hub-and-spoke model allows for development up (from an analyst to a functional lead), across (rotations in multiple functional areas, or between analytics and data engineering / database engineering / data science) and out (transition directly into a business unit). The third path is crucially important: the impact of analytics will only be fully realized when you have analytics team members spread throughout the company in non-analytics decision making roles.
Objectivity – Perhaps the most valuable benefit of the hub-and-spoke model. Given their alignment with business areas, “spokes” are responsible and accountable as functional subject matter experts. Likewise, the “hub” is responsible and accountable for enterprise data architecture. Reporting into a dedicated analytics leader allows both the hub and spokes objectivity in their analyses, research, and recommendations.

In summary, the hub-and-spoke model works because your analytics team is able to provide both responsiveness and objective strategic leadership to your business partners, because cost and time are minimized by using the right tools and having them managed and implemented by the right team, and because your team is engaged and motivated by purposeful and meaningful career development opportunities.

Conclusion

The right team structure is necessary to realize full ROI from your analytics capabilities. Let’s revisit the forensic analytics example at the beginning of this post, this time under a hub-and-spoke structure. Database and data engineers are extracting source data and maintaining the architecture of a single, centrally governed data warehouse; analysts are designing queryable data marts for each functional group; functional leads are developing and aligning on shared, transparent and known business logic and KPI definitions with their business partners. Consequently, customer retention reporting is automatically refreshed and up-to-date, and consistent results are being shared widely throughout the organization because data is both governed and democratized. The weeks previously spent on forensic analytics are now reallocated to answering why churn increased/decreased during the month, and more importantly, what to do about it. Only now, with the right team structure, are you able to realize analytics ROI.
