Data and AI Predictions for 2024: Shifting Sands in Enterprise Data and AI Technologies
From data lakehouses and generative AI to controlling cloud migration costs, data professionals will have their hands full in 2024.
- By David P. Mariani
- January 5, 2024
As enterprises continue to evolve their data infrastructure and explore the burgeoning landscape of generative AI, 2024 is set to be a year of strategic realignment and measured adoption. Here are three themes I expect to define the technological trajectory of the coming year.
Prediction #1: Enterprises will cautiously explore, but not wholeheartedly embrace, generative AI
The promise of generative AI is immense, and its potential applications have captured the imagination of technologists worldwide. In 2024, interest in generative AI will remain high, but actual IT investments will be modest. Concerns over data residency, security, and the complexities of operationalizing such cutting-edge technology will prompt a conservative approach.
We can expect to see generative AI applications largely confined to two areas: enhancing customer service through more sophisticated chatbots, and supporting professionals in various industries as copilots. Meanwhile, large language models (LLMs), a critical component of generative AI, will likely remain under the purview of a few major vendors, led by OpenAI with its market-leading offerings. Google, with Bard, and Amazon, with its bet on Anthropic, will strive to capture more of the market, yet they will largely be playing catch-up to OpenAI's first-mover advantage.
The in-house development of LLMs by individual enterprises will be rare; the expertise, resources, and data required to train and maintain such models are beyond the reach of all but the largest tech giants. This will keep the power of LLMs centralized among a few key players, with OpenAI positioned to dominate the space through 2024.
Prediction #2: Cloud migration will be viewed through a cost-conscious lens
As organizations continue their shift to cloud-based data and analytics infrastructure, a more prudent fiscal outlook will be the theme for 2024. The cloud migration megatrend will not reverse, but organizations will scrutinize their cloud spend more than ever due to the challenging macroeconomic environment.
In the cloud analytics arena, Databricks and Snowflake will continue their dominance with their well-established platforms. In particular, Databricks’ first-mover advantage in lakehouse architecture will allow it to capture more market share. The lakehouse paradigm combines the flexibility of data lakes with the management features of data warehouses, offering enterprises the best of both worlds.
On the other hand, Google BigQuery is expected to retain its stronghold within Google Cloud Platform (GCP) deployments, bolstered by deep integration with other GCP services and a strong existing customer base.
However, the economic headwinds will compel enterprises to consider the total cost of ownership more closely. As a result, the traditional data warehouse architecture will see a decline in favor of the more cost-effective lakehouse design pattern. This pattern, which aligns well with the current move towards a decentralized and flexible approach to data management, will gain momentum as organizations seek to optimize their cloud spending while still harnessing the power of big data and advanced analytics.
Prediction #3: Microsoft Fabric and the data lakehouse movement will grow in popularity
With the continuous evolution of data management strategies, more organizations are expected to consider Microsoft Fabric, which the company calls an all-in-one analytics solution, as a contender when deploying an enterprise-grade data lakehouse. The allure of Microsoft Fabric lies in its promise to integrate seamlessly with the Microsoft ecosystem. In particular, its Direct Lake technology lets Power BI run analytics directly on top of the data lake, delivering near-real-time insights and phasing out the more cumbersome Power BI imports.
Despite this appeal, enterprises are likely to encounter challenges with the relative immaturity of Microsoft Fabric's infrastructure. Given the complexity involved and the need for robust, battle-tested systems, businesses may continue to lean on established players such as Databricks and Snowflake. These technologies have a proven track record and are familiar territory for many data teams. Microsoft Fabric will gain attention, but the stickiness of existing solutions will help the incumbents retain significant market share throughout 2024.
Measured and Cautious Investments in 2024
Looking ahead to 2024, the narrative is not about the decline of any particular technology but rather about adaptation and strategic investment. Microsoft Fabric will emerge as a strong competitor in the data lakehouse arena but will have to prove its maturity to win over enterprises. Generative AI will continue to fascinate, but enterprise adoption will be a story of careful steps rather than giant leaps. Finally, as the cloud becomes the default setting for enterprise data and analytics, the emphasis will be on getting more for less, an endeavor that will favor flexible, cost-effective lakehouse architectures over their more traditional counterparts.
As always, the future is not written in stone, but in data and code.
About the Author
Dave Mariani is the founder and chief technology officer of AtScale. Prior to AtScale, Dave ran data and analytics for Yahoo!, where he pioneered the development of Hadoop for analytics and created the world's largest multidimensional analytics platform. He also held the position of CTO for Bluelithium, where he managed one of the first display advertising networks, delivering 300M ads per day powered by a multi-terabyte behavioral targeting data warehouse. Dave is a big data visionary and serial entrepreneur. You can contact the author on LinkedIn.