TDWI | Training & Research | Business Intelligence, Analytics, Big Data, Data Warehousing

TDWI Articles

Is a Semantic Data Plane the Answer to Poor Data Management?

With so many obstacles to a successful data management strategy, can a semantic data plane make a difference?

By Bharti Patel
April 22, 2024

In spite of -- or perhaps because of -- the decades-long shift to hybrid computing and distributed cloud architectures, the jaw-dropping hardware and software improvements, and the breakneck pace of developments in generative AI, poor data management is still commonplace. In fact, results from the recent TDWI Data Management Maturity Assessment show that, although 71% of IT experts agreed their organization values data, only 19% said a strong data management strategy was in place, and close to half (45%) said their strategy wasn’t communicated. What’s going on here?

For Further Reading:

Critical Components for Data Fabric Success

5 Steps to Implementing a Modern Data Fabric Framework

Data Fabric: How to Architect Your Next-Generation Data Management

A truly promising way to improve data management has emerged, but not from some vaguely defined AI solution. Rather, it comes from the semantic data plane -- a data fabric for structured data, unstructured data, and metadata that supports data virtualization, fast distributed query processing, and local data transformations. It may leverage AI to improve efficiencies, but it is primarily designed from the ground up to resolve core data issues and their complexities.

Put the Business First

Above all else, an effective data management system must solve business problems and drive business value. That’s its raison d'être. However, despite knowing this, data management in practice is often not designed to serve business use cases quickly.

A system must be natively flexible enough to handle new kinds of use cases that emerge. Teams prioritizing business problems and use cases are asking: What kind of data is required to tackle the problem in front of me? What kind of data access is needed and by whom? How fast can I get the data I need to swiftly pivot to solving an unexpected or new business problem tomorrow, next month, or next year?

A semantic data plane prioritizes business problems by allowing users to focus on the content and meaning of structured and unstructured data instead of having to figure out how to connect to different storage locations, how to deal with different data formats, and how to move data from one source’s system to another. All data sets and objects are virtualized. If, for instance, a user creates BI queries, reports, and dashboards on data sets, the queries will continue to work if those data sets are moved.
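To make the virtualization claim concrete, here is a minimal sketch, assuming a catalog that maps a logical data set name to its physical location. The `Catalog` class, the `total_sales` query, and the storage URIs are hypothetical names invented for illustration; they are not part of any particular product. The point is only that the query refers to the logical name, so moving the data does not break it.

```python
from dataclasses import dataclass, field


@dataclass
class Catalog:
    """Hypothetical catalog: logical data set name -> physical location."""
    locations: dict = field(default_factory=dict)  # name -> location URI
    storage: dict = field(default_factory=dict)    # location URI -> rows

    def register(self, name, location, rows):
        self.locations[name] = location
        self.storage[location] = rows

    def move(self, name, new_location):
        # Relocate the rows and update the mapping; the logical name is unchanged.
        old_location = self.locations[name]
        self.storage[new_location] = self.storage.pop(old_location)
        self.locations[name] = new_location

    def scan(self, name):
        # Resolve the logical name at query time.
        return self.storage[self.locations[name]]


def total_sales(catalog, region):
    """A 'BI query' written only against the logical name 'sales'."""
    return sum(r["amount"] for r in catalog.scan("sales") if r["region"] == region)


catalog = Catalog()
catalog.register(
    "sales",
    "s3://bucket-a/sales",  # illustrative location only
    [
        {"region": "EMEA", "amount": 100},
        {"region": "EMEA", "amount": 50},
        {"region": "APAC", "amount": 70},
    ],
)

before = total_sales(catalog, "EMEA")
catalog.move("sales", "file:///onprem/sales")  # the data set is moved
after = total_sales(catalog, "EMEA")           # the same query still works
```

A real data plane would resolve names across networked stores rather than an in-memory dictionary, but the indirection is the same idea.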

Semantic search allows us to locate relevant documents quickly, based on the meaning of the documents, and advanced retrieval-augmented generation (RAG) pipelines make it easy to submit queries and get summaries and translations of data without having to be familiar with generative AI or having to identify specific documents in the first place.

Because the semantic data plane supports structured data, unstructured data, and metadata, a business user can easily “connect the dots.” For instance, a query about a specific customer could lead to a sales report, contracts, and invoices. The same tool (a chat-based interface) can provide answers in natural language or provide tables or charts depending on the source of the data or the requested output format.

Clear the Path to the Right Data

Access to the right data is a non-negotiable requirement in the design of modern data management systems regardless of where data resides, what format it’s in, whether it’s structured or unstructured, or how it is stored, moved, or migrated. When data is poorly managed, there’s no direct access to the latest clean data by those who need it most. A company may have loads of data, but much of it is dark -- that is, data collected and stored that’s not being used for business purposes. In fact, large organizations never use about half of all data they store -- an average of 17 PBs, according to Hitachi Vantara’s Modern Data Infrastructure Dynamics Report.

In an AI-focused world, large language models and customized, smaller models need the right data access, too. If models train on problematic data, they’ll deliver problematic responses, such as biased information and hallucinations. You may have trillions of parameters, yet large swaths may be unusable. What constitutes the right data changes over time as well: data decays, and stale data yields erroneous results. Data access can also be hampered by movement from on-premises storage to the cloud or vice versa, or from system to system via different data pipelines. When data moves, systems frequently break and data copies multiply exponentially.

Because a semantic data plane provides an abstraction layer to data, business users can access data in the same way no matter where the data is located. A user authenticated and authorized to access data can access that data using a consistent API across disparate data sources. For example, for relational data, access can be provided via an efficient API such as Apache Arrow Flight/Arrow Flight SQL, or ADBC (Arrow Database Connectivity). For legacy clients, the much slower JDBC and ODBC APIs can be supported. Again, domain experts do not need to know where the data is physically located as long as data access is fast across the hybrid cloud environments that support their work. Similar APIs are available for unstructured data.
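The abstraction layer described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the Arrow Flight SQL or ADBC API: the `DataPlane`, `SqliteSource`, and `CsvSource` classes are hypothetical, and an in-memory SQLite table and a CSV string stand in for disparate sources. What it shows is one `read` call that checks authorization and then returns rows in the same shape regardless of where the data lives.

```python
import csv
import io
import sqlite3


class SqliteSource:
    """Hypothetical adapter for a relational table."""

    def __init__(self, conn, table):
        self.conn, self.table = conn, table

    def read(self):
        # The table name comes from trusted registration code, not user input.
        cur = self.conn.execute(f"SELECT * FROM {self.table}")
        columns = [d[0] for d in cur.description]
        return [dict(zip(columns, row)) for row in cur.fetchall()]


class CsvSource:
    """Hypothetical adapter for CSV text (values are returned as strings)."""

    def __init__(self, text):
        self.text = text

    def read(self):
        return [dict(row) for row in csv.DictReader(io.StringIO(self.text))]


class DataPlane:
    """One access path for every registered data set."""

    def __init__(self):
        self.sources = {}
        self.grants = {}  # data set name -> set of authorized users

    def register(self, name, source, allowed_users):
        self.sources[name] = source
        self.grants[name] = set(allowed_users)

    def read(self, user, name):
        if user not in self.grants.get(name, set()):
            raise PermissionError(f"{user} may not read {name}")
        return self.sources[name].read()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
conn.execute("INSERT INTO customers VALUES (1, 'Acme')")

plane = DataPlane()
plane.register("customers", SqliteSource(conn, "customers"), {"ana"})
plane.register("invoices", CsvSource("id,customer_id,total\n10,1,99.50\n"), {"ana"})

# The caller uses the same call for both sources and never sees their location.
customers = plane.read("ana", "customers")
invoices = plane.read("ana", "invoices")
```

In practice the adapters would sit behind a wire protocol, and results would travel in a columnar format rather than as lists of dictionaries; the sketch keeps only the access pattern.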

Reduce Excessive Duplication

Enterprises struggle mightily with having multiple copies of the same data. As their data grows, so does their data duplication -- and the money wasted on storing and maintaining copies. Enterprises retain these copies, knowingly or not, either because they are unaware of what copies exist and where they live or because, even once they have a handle on all this extant data, they have not yet implemented ways to mitigate the problem. Data lakehouses have attempted to solve this problem by bringing data warehouse functionality directly to object storage and data lakes, but their approaches to versioning and their integration with distributed enterprise systems vary in efficacy.

The semantic data plane further mitigates the multiple copy problem by maintaining data lineage for all data sets (and files/objects) that are duplicated or moved. This allows an organization to quickly determine how one data set was derived from other data sets. Plus, adding decentralized semantic search capabilities means local document embeddings can be created that can be compared globally. By performing a similarity search, users can determine whether documents contain the same information.
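Both ideas in the paragraph above, lineage tracing and similarity search over embeddings, can be sketched in a few lines. This is a toy illustration under assumptions: the `lineage` graph and the document texts are made up, and the bag-of-words `embed` function is a stand-in for a real embedding model, which would capture meaning rather than exact word overlap.

```python
import math
from collections import Counter

# Hypothetical lineage records: derived data set -> the data sets it came from.
lineage = {
    "sales_clean": ["sales_raw"],
    "sales_report": ["sales_clean", "customers"],
}


def ancestors(name, graph):
    """Return every data set that `name` was directly or indirectly derived from."""
    seen = []
    stack = list(graph.get(name, []))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.append(parent)
            stack.extend(graph.get(parent, []))
    return seen


def embed(text):
    """Toy 'embedding': lowercase word counts (stand-in for a real model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


doc_a = "invoice 1042 for Acme Corp total 99.50"
doc_b = "Invoice 1042 for Acme Corp total 99.50"  # a copy stored elsewhere
doc_c = "quarterly roadmap planning notes"

sources_of_report = ancestors("sales_report", lineage)
copy_score = cosine(embed(doc_a), embed(doc_b))       # close to 1: likely duplicate
unrelated_score = cosine(embed(doc_a), embed(doc_c))  # 0: no shared terms
```

Because each site can compute embeddings locally and share only the vectors, the comparison itself can run globally without moving the documents, which is the decentralized aspect the text describes.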

Improve Integration of Heterogeneous Systems

Another fundamental problem is that business systems are often not well integrated, so two different business units may be using different tools and solutions. Sometimes, even within the same business unit, there will be different systems that have not been integrated. Data might exist in different and inconsistent formats. As a result, data lives in different silos that aren’t interoperable. If systems are particularly complex -- as in the case of systems cobbled together from multiple vendor solutions and tools -- integration will be especially difficult.

Although technical teams often do have (or can learn) the skills needed for integration projects, finding the time, resources, and bandwidth is often the greater challenge. Business users are even worse off in these scenarios. AI may one day be able to help with some of these integration and documentation processes, but we’re not there yet. Rather, it’s architecture and tools that abstract away these integration challenges that will have the most immediate impact.

A semantic data plane serves this end by providing data virtualization and distributed query processing across disparate data sources. Even when joining data sets across heterogeneous systems, a business user does not need to know where data is physically located. There is no integration problem because all structured data sets (e.g., database tables, Parquet files, JSON files, CSV files, ORC files) are automatically converted into a highly efficient columnar data format that is also its own serialization format. This way, data can be efficiently transferred without any additional serialization and deserialization, which, in turn, supports joins of any kind of structured data across disparate data sources.
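As a rough sketch of the conversion-then-join idea, the example below reads one data set from CSV and another from JSON, pivots both into a common column-oriented layout, and joins them. Everything here is an assumption made for illustration: a plain dictionary of lists stands in for a real columnar format such as Apache Arrow, the `to_columns` and `hash_join` helpers are hypothetical, and the join assumes every row has the same keys and that non-key column names do not collide.

```python
import csv
import io
import json


def to_columns(rows):
    """Pivot a list of row dicts into a column-oriented dict of lists."""
    columns = {}
    for row in rows:
        for key, value in row.items():
            columns.setdefault(key, []).append(value)
    return columns


def hash_join(left, right, key):
    """Inner join of two columnar tables on `key`, returning a columnar table."""
    # Build an index from key value to row positions on the right side.
    index = {}
    for j, k in enumerate(right[key]):
        index.setdefault(k, []).append(j)

    right_names = [name for name in right if name != key]
    out = {name: [] for name in list(left) + right_names}
    for i, k in enumerate(left[key]):
        for j in index.get(k, []):
            for name in left:
                out[name].append(left[name][i])
            for name in right_names:
                out[name].append(right[name][j])
    return out


orders_csv = "customer_id,amount\n1,250\n2,75\n1,40\n"
customers_json = '[{"customer_id": 1, "name": "Acme"}, {"customer_id": 2, "name": "Globex"}]'

# Source-specific parsing happens once, at the edge; CSV values need typing.
orders = to_columns(
    {"customer_id": int(r["customer_id"]), "amount": int(r["amount"])}
    for r in csv.DictReader(io.StringIO(orders_csv))
)
customers = to_columns(json.loads(customers_json))

# After conversion, the join no longer cares which format each side came from.
joined = hash_join(orders, customers, "customer_id")
```

The design choice worth noting is that format handling is confined to the conversion step, so the join logic is written once against the columnar layout.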

The Semantic Data Plane Puts Us Nearer to the Answer

It’s worth paying close attention this year to how data management through the semantic data plane, AI-powered processes in data systems, and open source technologies leading to a “sixth data platform” will all unfold and collide. Perhaps we’re closer to understanding data than ever before, which makes tackling poor data management easier than ever before.
