How to Build a DataOps Team: 3 Key Team Functions
Before you assemble your DataOps team, you must identify the functions members will perform. First part of a two-part series.
- By Mark Marinelli
- February 11, 2019
Imagine what you could accomplish if users in your organization had high-quality data at their fingertips that they didn't need to prepare themselves. What if your organization could answer questions such as "Who are our suppliers?" consistently and completely? Even answering simple questions such as this can result in a competitive edge.
However, if your organization is like most, you've been deploying systems for business process automation and treating the data generated from these deployments as a byproduct rather than as a business asset. It's time to transform this data into a competitive advantage as giants such as Google and Amazon have. The way to accomplish this is with data operations (DataOps).
The question is where to begin. If you haven't read Andy Palmer's Upside interview about DataOps, do so now. It's a terrific introduction and overview of DataOps, covering why it's gaining traction, how it's evolving, and why it's so important in large enterprises. Palmer also addresses key components, challenges, examples of successful implementations, and what's in store for DataOps over the next three to five years.
The DataOps Team
The next looming question is how to build a DataOps team because people are a vital cornerstone of the DataOps equation. A high-performance DataOps team rapidly produces new analytics and flexibly responds to marketplace demands. They unify the data from diverse, previously fragmented sources and transform it into a high-quality resource that creates value and enables users to gain actionable insights.
We'll start by identifying the key functions in the DataOps team. If you're versed in data management, the functions may sound familiar, but the DataOps methodology calls for different skill sets and working processes than have been traditionally employed, just as the DevOps approach to software development restructured existing development teams and tooling. The focus on agility and continuous iteration necessitates more collaboration across these functions to build and maintain a solid data foundation amid constantly shifting sources and demands.
Using supplier data as an example, let's look at three of these functions.
Data Supply
Who owns our internal supplier management systems? Who owns our relationships with external providers of supplier data? The answer to these questions, typically found in the CIO's organization, is the data source supplier, of which you probably have dozens. As we transition from views and SQL queries to data virtualization and APIs, and from ERDs and data dictionaries to schemaless stores and rich data catalogs, the expectations for discoverability and access to these sources have increased, but requirements for controlled access to sensitive data remain. In a DataOps world, these source owners must work together across departments, and with data engineers, to build the infrastructure necessary so the rest of the business can leverage all data.
A great data supplier can confidently say: Here are all the sources that contain supplier data.
Data Preparation
Effective data preparation requires a combination of technical skill to wrangle raw sources and business-level understanding of how the data will be used. DataOps expands traditional preparation beyond the data engineers who move and transform data from raw sources into marts or lakes. It also includes the data stewards and curators responsible for both the quality and governance of critical data sources that are ready for analytics and other applications. The chief data officer is the ultimate executive owner of the data preparation function, ensuring that data consumers have access to high-quality, curated data.
A great data preparation team can confidently say: Here is everything we know about supplier X.
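To make "everything we know about supplier X" concrete, here is a minimal sketch of merging supplier records from several systems into one profile per supplier. The source names, fields, and the name-based matching rule are all illustrative assumptions; real unification usually needs far more robust matching plus review by data stewards.

```python
from collections import defaultdict

# Hypothetical records from three source systems; names and fields are illustrative.
SOURCES = {
    "erp": [{"name": "Acme Corp.", "country": "US"}],
    "procurement": [{"name": "ACME Corp", "payment_terms": "NET30"}],
    "crm": [{"name": "Globex, Inc.", "contact": "j.doe@example.com"}],
}

# Legal suffixes ignored when matching names (an assumed, deliberately short list).
LEGAL_SUFFIXES = {"inc", "corp", "llc", "ltd"}


def normalize(name: str) -> str:
    """Build a crude matching key: lowercase, no punctuation, no legal suffixes."""
    cleaned = "".join(ch for ch in name.lower() if ch.isalnum() or ch.isspace())
    tokens = [t for t in cleaned.split() if t not in LEGAL_SUFFIXES]
    return " ".join(tokens)


def unify(sources):
    """Merge records that share a matching key into one profile per supplier."""
    unified = defaultdict(lambda: {"sources": []})
    for system, records in sources.items():
        for record in records:
            entry = unified[normalize(record["name"])]
            entry["sources"].append(system)  # keep lineage back to each system
            for field, value in record.items():
                entry.setdefault(field, value)  # first value seen wins
    return dict(unified)


profiles = unify(SOURCES)
print(profiles["acme"])
```

Keeping the list of contributing systems on each profile is one simple way to preserve lineage, so a quality problem found later can be traced back to the source that introduced it.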
Data Consumption
On the "last mile" of the data supply chain, we have everyone responsible for leveraging unified data for a variety of outcomes across analytical and operational functions. In our supplier data example, we have data analysts building dashboards charting aggregate spending with each supplier, data scientists building inventory optimization models, and data developers building supplier 360 portal pages.
Modern visualization, analysis, and development tools have liberated these data consumers from some of the constraints of traditional BI tools and data marts. However, they still must work closely with the teams responsible for providing them with current, clean, and comprehensive data sets. In a DataOps world, this means providing a feedback loop so that when data issues are encountered, they aren't merely corrected in a single dashboard but are instead communicated upstream so that actual root causes can be uncovered and (optimally) corrections can be made across the entire data community.
A great data consumer can confidently say: Here are our actual top 10 suppliers by annual dollars spent, now and projected into next year.
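As a rough illustration of the "top suppliers by annual dollars spent" question, the sketch below ranks suppliers by total invoiced amount for one year. The invoice rows and the `top_suppliers` helper are hypothetical, and the sketch assumes supplier names have already been unified upstream. It does not attempt the projection into next year.

```python
from collections import Counter

# Hypothetical invoice rows: (supplier, year, amount in dollars).
INVOICES = [
    ("acme", 2018, 120_000.0),
    ("globex", 2018, 75_500.0),
    ("acme", 2018, 30_000.0),
    ("initech", 2018, 98_250.0),
    ("globex", 2017, 40_000.0),
]


def top_suppliers(invoices, year, n=10):
    """Return the n suppliers with the highest total spend in the given year."""
    totals = Counter()
    for supplier, invoice_year, amount in invoices:
        if invoice_year == year:
            totals[supplier] += amount
    return totals.most_common(n)  # sorted from highest to lowest spend


ranking = top_suppliers(INVOICES, 2018)
for supplier, spend in ranking:
    print(f"{supplier}: ${spend:,.2f}")
```

The ranking is only as trustworthy as the unification behind it: if "acme" appeared under two spellings, its spend would be split across two rows and the supplier could drop out of the top 10.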
A Final Word
High-functioning DataOps teams are poised to redefine what is possible in data analytics. They have the opportunity to enable high-visibility breakthroughs that deliver significant value to the organization. They also offer careers that are rewarding and in high demand.
In part 2 of this series, we'll discuss where these functions are necessary and how to staff them with the people you have, as well as where and when to bring in new talent to round out your DataOps organization.
About the Author
Mark Marinelli is head of product with Tamr, which builds innovative solutions to help enterprises unify and leverage their key data. A 20-year veteran of enterprise data management and analytics software, Mark has held engineering, product management, and technology strategy roles at Lucent Technologies, Macrovision, and most recently at Lavastorm, where he was chief technology officer.