Data, Development, and Analytics: A Look Ahead
How is the evolving data landscape changing the way data engineers and IT teams manage data and the way analysts derive value from it? We asked Sean Knapp, founder and CEO of Ascend.io, for his thoughts on what's ahead.
- By James E. Powell
- November 5, 2019
Upside: What technology or methodology must be part of an enterprise's data strategy if it wants to be competitive today? Why?
Sean Knapp: The data landscape is following patterns we've seen in other major technology domains. Over the past 10 years, for example, DevOps has leveraged automation to deliver speed and flexibility within the software development life cycle. Now we're seeing that same automation applied across more areas of development.
In data pipelines, for example, the traditional manual, repetitive process of pipeline creation is far too time-consuming and yields brittle pipelines with high maintenance costs. Applying automation to the data development life cycle frees data engineers to focus on other work that is critical to the business and brings stability, speed, and flexibility to data architectures regardless of inevitable changes in data types and applications.
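As a concrete illustration of that declarative style, here is a minimal sketch (the `transform` decorator and registry are hypothetical, invented for this example and not Ascend's actual API): each stage declares only its inputs, and the framework derives the execution order itself rather than relying on hand-coded task schedules.

```python
# Hypothetical declarative pipeline framework -- a sketch, not a real library.
from graphlib import TopologicalSorter  # standard library, Python 3.9+

_transforms = {}  # stage name -> (function, upstream stage names)

def transform(*, inputs=()):
    """Register a function as a pipeline stage, declaring only its inputs."""
    def wrap(fn):
        _transforms[fn.__name__] = (fn, tuple(inputs))
        return fn
    return wrap

@transform()
def raw_events():
    # Stand-in for an ingestion step.
    return [{"user": "a", "amount": 3}, {"user": "b", "amount": 5}]

@transform(inputs=["raw_events"])
def totals(raw_events):
    out = {}
    for e in raw_events:
        out[e["user"]] = out.get(e["user"], 0) + e["amount"]
    return out

def run():
    # The framework, not the engineer, derives a valid execution order.
    graph = {name: set(deps) for name, (_, deps) in _transforms.items()}
    results = {}
    for name in TopologicalSorter(graph).static_order():
        fn, deps = _transforms[name]
        results[name] = fn(*(results[d] for d in deps))
    return results

print(run()["totals"])  # {'a': 3, 'b': 5}
```

The point of the pattern is that adding or reordering stages never requires rewriting a scheduler; the dependency graph is recomputed from the declarations.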
What one emerging technology are you most excited about and think has the greatest potential? What's so special about this technology?
It's been really exciting to see the rise of Kubernetes and how fundamentally it has transformed the way companies develop and run applications, especially in an increasingly multicloud and hybrid world. The accompanying move to microservices architectures unlocks a lot of value for businesses, enabling greater flexibility through automation and protection from vendor lock-in. It's why Ascend opted to use Kubernetes when we built our elastic data fabric, resulting in on-demand processing that handles near-limitless scale and full portability across cloud environments, so our customers get the same experience no matter where they run.
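To give a feel for that portability, the sketch below uses the official Kubernetes Python client to submit the same containerized job to different clusters just by switching kubeconfig contexts; the context name, namespace, and container image are placeholders, not anything from Ascend's stack.

```python
# A minimal portability sketch using the official Kubernetes Python client.
from kubernetes import client, config

def submit(context: str) -> None:
    # Point at the target cluster; switching the context is all it takes
    # to run the identical workload on GKE, EKS, AKS, or on-premises.
    config.load_kube_config(context=context)
    job = client.V1Job(
        metadata=client.V1ObjectMeta(name="nightly-transform"),
        spec=client.V1JobSpec(
            template=client.V1PodTemplateSpec(
                spec=client.V1PodSpec(
                    restart_policy="Never",
                    containers=[
                        client.V1Container(
                            name="worker",
                            image="example.com/etl/transform:1.0",  # placeholder image
                            command=["python", "transform.py"],
                        )
                    ],
                )
            )
        ),
    )
    client.BatchV1Api().create_namespaced_job(namespace="default", body=job)

submit("gke-analytics")  # hypothetical kubeconfig context name
```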
What is the single biggest challenge enterprises face today? How do most enterprises respond (and is it working)?
Managing a greater variety of large-scale, ever-changing data is the new normal. It's not sufficient to simply manage this data, however; data engineers and IT teams must also enable an ever-increasing number of people to do increasingly powerful things with it. These accelerating complexities make it challenging to orchestrate the movement of data across the enterprise. Most organizations rely heavily on manual engineering work to translate what the business actually needs from the data into low-level tasks with hard-coded, predefined rules and triggers. This slows down overall development and becomes untenable to maintain as data and downstream user needs continue to grow and change.
This current model is ill-equipped to provide the speed and flexibility required for digital transformation and leaves already overworked data engineering teams to constantly battle a legacy code base while the business moves on without them.
Is there a new technology in data and analytics that is creating more challenges than most people realize? How should enterprises adjust their approach to it?
Without a doubt, the shift to the cloud has unlocked powerful benefits of ease and flexibility for enterprises. However, the promise of on-demand, always-available resources and systems isn't without its challenges. With so many cloud services to choose from, it can be daunting to understand which are actually necessary for the project at hand. Although these managed services may be quick to spin up in isolation, stitching them together to deliver end-to-end use cases is a major (and ongoing) engineering effort. Each service has its own preferences and tuning idiosyncrasies that need to be reconciled to ensure efficiency, stability, and data accuracy.
To address this, enterprises need to evaluate each service in the context of their overall use case and how it helps deliver end-to-end value. This helps protect against failure and skyrocketing cloud costs as data changes, dependencies grow, and the interconnectedness becomes increasingly complex.
What initiative is your organization spending the most time/resources on today?
At Ascend, we strive to be as data driven as possible, especially when it comes to understanding our customers' usage and behaviors. We leverage our own deployment of our Autonomous Dataflow Service to easily and quickly collect and analyze product logs, telemetry data, user activity, and more. By making sense of this massive amount of data, we have been able to make better decisions on new feature development and fixes, understand where to focus resources to drive higher usage, and preempt support and troubleshooting needs.
Where do you see analytics and data management headed in 2020 and beyond? What's just over the horizon that we haven't heard much about yet?
There have been many recent advancements and innovations in data management, including large-scale processing engines and cloud-based data warehouses, but the most powerful systems of today are still being controlled and orchestrated by some of the most rudimentary technologies. How data moves and is transformed is still dictated by manual, hard-coded triggers and rules, resulting in slow development cycles and brittle pipelines. Taking inspiration from other technology designs (such as Kubernetes and React), the industry will begin to move away from manual, task-based design and scheduling toward intelligent orchestration that leverages data-centric algorithms and automation. This shift will lower design and maintenance costs across the data development life cycle while improving the quality and reliability of the resulting data pipelines.
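Here is a miniature sketch of that data-centric idea (all names are hypothetical, not any real orchestrator's API): instead of firing tasks on a fixed schedule, the system fingerprints each stage's inputs and logic and recomputes only what has actually changed, much as React re-renders only the components whose inputs changed.

```python
# Data-centric orchestration in miniature: recompute only stale stages.
import hashlib
import inspect
import json

_cache = {}  # stage name -> (fingerprint, result)

def _fingerprint(fn, inputs):
    # Hash the transform's source code together with its input data, so a
    # change to either one marks the stage as stale.
    payload = inspect.getsource(fn) + json.dumps(inputs, sort_keys=True, default=str)
    return hashlib.sha256(payload.encode()).hexdigest()

def materialize(name, fn, inputs):
    fp = _fingerprint(fn, inputs)
    cached = _cache.get(name)
    if cached and cached[0] == fp:
        print(f"{name}: up to date, skipping")
        return cached[1]
    print(f"{name}: stale, recomputing")
    result = fn(inputs)
    _cache[name] = (fp, result)
    return result

def clean(rows):
    return [r for r in rows if r > 0]

materialize("clean", clean, [1, -2, 3])     # first run: recomputes
materialize("clean", clean, [1, -2, 3])     # nothing changed: skipped
materialize("clean", clean, [1, -2, 3, 4])  # new data arrived: recomputes
```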
Describe your product/solution and the problem it solves for enterprises.
Ascend developed the world's first Autonomous Dataflow Service, enabling data engineers to more efficiently build, scale, and orchestrate data pipelines. Ascend runs natively in the public cloud and combines declarative configurations and automation to manage the underlying infrastructure, optimize pipelines, and eliminate maintenance across the entire data life cycle.
[Editor's Note: Sean Knapp is the founder and CEO of Ascend.io and was a co-founder, CTO, and Chief Product Officer at Ooyala. Sean scaled Ooyala to 500 employees, oversaw products and engineering, and defined Ooyala's product vision for their award-winning analytics and video platform solutions. Before founding Ooyala, Sean worked at Google where he was the technical lead for Google's legendary Web Search Frontend team. Sean has B.S. and M.S. degrees in Computer Science from Stanford University.]
About the Author
James E. Powell is the editorial director of TDWI, including research reports, the Business Intelligence Journal, and the Upside newsletter. You can contact him via email.