TDWI | Training & Research | Business Intelligence, Analytics, Big Data, Data Warehousing

TDWI Articles

Executive Q&A: Getting the Most from Unstructured Data

Enterprises are still struggling to mine the wealth of information contained in unstructured data. What are they doing right and wrong? Edward Cui, founder of Graviti, shares his perspective.

By James E. Powell
January 5, 2022

If you're analyzing only structured data, you're missing a wealth of insights. Edward Cui, founder of Graviti, explains how to access the value in phone calls, emails, and even social media posts.

For Further Reading:

Why Do We Call Text "Unstructured"?

3 Use Cases for Unstructured Data

A Case for Managing Data Uniquely for Each Form of Advanced Analytics

Upside: What are some of the most common types of unstructured data enterprises aren't using in their analytics?

Edward Cui: Enterprises have long used structured data for analysis, including transactional data, master data, and analytical data. However, over 80 percent of enterprise data is now unstructured. This includes emails, images, recordings, videos, text files, PowerPoint presentations, and social media data. This large volume of unstructured data clearly contains business insights and value that haven't been mined yet.

To improve customer experience, phone calls, online chats, emails, and even comments on social media accounts can provide key insights into customer sentiment. By analyzing this data, you can learn what customers like or don't like about your brand, product, or service, and ultimately increase marketing effectiveness.

For enterprise management, unstructured data can be analyzed to improve employee productivity, deliver business intelligence, and promote innovation.

After so many years of recognizing the value of unstructured data, and with so much unstructured data being collected, enterprises still find it a challenge to incorporate it into their analytics. What are the challenges they face?

Unstructured data cannot be easily stored in a traditional row-and-column store such as a database table or spreadsheet. Because unstructured data comes in different formats such as videos, images, and phone calls, there is no unified standard for analyzing it. Even when enterprises transform unstructured data into structured data, they must rely on artificial intelligence to analyze it and achieve their goals. Most enterprises don't have the budget or time to develop such a tool.

Another challenge is that the volume of unstructured data is growing very fast; one estimate puts the increase at about 55 to 65 percent per year. Whether enterprises choose to manage the data online (uploaded to the cloud) or offline (on an enterprise's local server), the cost of AI can be challenging.
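To make that growth rate concrete, here is a minimal sketch of the compounding arithmetic. The 100 TB starting volume and the five-year horizon are illustrative assumptions, not figures from the interview; only the 55 to 65 percent annual range comes from the estimate above.

```python
def projected_volume(initial_tb: float, annual_growth: float, years: int) -> float:
    """Return the data volume after compounding annual_growth for the given years."""
    return initial_tb * (1 + annual_growth) ** years


# Hypothetical starting point: 100 TB of unstructured data today.
START_TB = 100.0
YEARS = 5

low = projected_volume(START_TB, 0.55, YEARS)   # lower end of the estimate
high = projected_volume(START_TB, 0.65, YEARS)  # upper end of the estimate

print(f"After {YEARS} years: {low:.0f} TB to {high:.0f} TB")
```

Under these assumptions the volume grows roughly nine- to twelve-fold in five years, which is why storage and processing costs need to be planned for early.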

What progress has been made in the last 10 years in using unstructured data? Why hasn't more progress been made?

The world of ten years ago was dominated by structured data. After 2012, though, sensors became cheaper, cell phones gradually became smartphones, and cameras were installed that made shooting easier. As a result, a large amount of unstructured data was generated, and enterprises entered uncharted territory, making progress slow. Some of the inhibitors to progress in this area include:

Complexity: Unlike structured data, which can be analyzed intuitively, unstructured data needs to be processed further before it can be analyzed, and this is usually best done through artificial intelligence. Machine learning algorithms classify and label its content. However, the sheer volume and complexity of unstructured data make it hard to identify high-quality data in a data set. This has been painful for developer teams and a key challenge for data architectures that are already complex.

Cost: Although enterprises recognize the value of unstructured data, cost can be an obstacle to making use of it. The infrastructure, human resources, and time required can hinder the development and implementation of AI to analyze the data.

What progress has been made in analyzing unstructured data in the last decade -- or has there been no progress at all? Are enterprises still stuck not being able to analyze it because they don't have the tools?

As noted, unstructured data didn't play a significant role in the last decade because of its complexity and rapid growth. Enterprises could not extract high-quality data from existing data sets without a suitable AI training model. However, a great example of progress is Data Version Control (DVC), an open-source version control system for machine learning projects that launched on GitHub in 2017. DVC is built to make machine learning models shareable and reproducible. It is designed to handle large files, data sets, machine learning models, and metrics as well as code.
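The general idea behind versioning large data files can be sketched in a few lines: store each version of a file in a cache keyed by a hash of its content, and keep only the short hash as the version identifier. The sketch below illustrates that idea under simplifying assumptions. It is not DVC's code or API, and the names `snapshot`, `restore`, and `demo`, along with the file `labels.csv`, are hypothetical.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def snapshot(data_file: Path, cache_dir: Path) -> str:
    """Copy data_file into cache_dir under its content hash and return the hash."""
    # Reads the whole file into memory; a real tool would hash in chunks.
    digest = hashlib.sha256(data_file.read_bytes()).hexdigest()
    cache_dir.mkdir(parents=True, exist_ok=True)
    cached = cache_dir / digest
    if not cached.exists():  # identical content is stored only once
        shutil.copyfile(data_file, cached)
    return digest


def restore(digest: str, cache_dir: Path, dest: Path) -> None:
    """Write the cached content identified by digest back to dest."""
    shutil.copyfile(cache_dir / digest, dest)


def demo() -> tuple[str, str, bytes]:
    """Version a small file twice, then roll it back to the first version."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        data, cache = root / "labels.csv", root / "cache"
        data.write_bytes(b"image_001,cat\n")
        v1 = snapshot(data, cache)
        data.write_bytes(b"image_001,cat\nimage_002,dog\n")
        v2 = snapshot(data, cache)
        restore(v1, cache, data)  # roll back to the first version
        return v1, v2, data.read_bytes()
```

Because only the short hash has to be recorded alongside the code, a team can tie a model to the exact data it was trained on without committing the large files themselves to source control.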

What recommendations or advice can you give enterprises that want to get started analyzing unstructured data? How should they begin? What best practices will make their job easier?

We recommend that enterprises have a fully prepared plan in place before they start analyzing unstructured data. Because the amount of unstructured data grows rapidly, enterprises must answer these questions clearly before collecting data: where to store the data (in the cloud or locally), how to identify high-quality data, how to develop the training model and iterate on it with newly collected data, and so on. Additionally, artificial intelligence professionals can help enterprises figure out the questions (and answers) before they begin collecting unstructured data for analysis.

What kinds of insights can unstructured data reveal?

Rich information can be extracted from unstructured data, and its value varies across industries. For example, we've recently partnered with a provider of intelligent logistics in the supply chain industry. The company provides AI monitoring services for streamlining warehousing operations. Based on the data collected, the client then leverages computer vision to identify and help organize the inventory. This data also helps predict and plan logistics well in advance, and improves the productivity, accuracy, and efficiency of production.

Another example comes from the automotive industry. Connected cars could receive direct product feedback from user interactions. Such unstructured data can be processed and analyzed for product planning, automobile development, quality improvement, manufacturing, and, of course, customer satisfaction.

How is Graviti working to make unstructured data easier to use?

Graviti aims to launch the first data platform that enables organizations to work with large volumes of unstructured data to power innovative AI applications. This platform eliminates the hassle and helps developers manage large amounts of unstructured data as a team.

Because most of the available information in AI development is low quality and unstructured, development teams usually spend over half of their time not on building models but on identifying, augmenting, or cleansing unstructured data, and that's just the beginning of their work. Graviti offers a more expert approach to data management that frees developers and gives them more time to analyze unstructured data and train artificial intelligence models. We help developers in three dimensions: data discovery, data iteration, and workflow automation.

About the Author

James E. Powell is the editorial director of TDWI, including research reports, the Business Intelligence Journal, and Upside newsletter. You can contact him via email here.

