
The Secret Behind the New AI Spring: Transfer Learning

Transfer learning has democratized artificial intelligence. A real-world example shows how.

As enterprises strive to find competitive advantages, artificial intelligence stands out as a "new" technology that can bring benefits to their organizations. Model building is a big part of AI, but it is a time-consuming chore, so anything an enterprise can do to make faster progress is a plus. That includes finding ways to avoid reinventing the wheel when it comes to building AI models.


Transfer learning allows developers to take a model trained on one problem and retrain it on a new problem, reusing all of the prior work to enhance the precision of the new model without the need for the massive data or compute scale it takes to generate a new model from scratch. This makes the process of building complex models accessible to teams that otherwise lack these resources.

To understand why this capability matters so much, we need a clear picture of the costs involved in building models.

ML Is Not New, Just Newly Accessible

The boom in machine learning's popularity may have led to the perception that ML is new. It's not -- businesses have been deriving value from ML-driven insights for 30 years. What is new is that there are now many more problems and opportunities that machine learning can address, and it's becoming clear that early adoption of ML can lead to profit advantages across all sectors.

The surge of interest in machine learning has expanded beyond academia. Machine learning has permeated all parts of the data-driven enterprise because two key technical innovations made it accessible to a much broader range of business areas and roles: a drastic reduction in the cost of processing and the rise of transfer learning.

Processing at Scale Is Now Affordable

Innovations in GPUs and distributed compute resources have brought machine learning and AI into the realm of the affordable (from $109 per billion floating-point operations per second in 2003 to $0.03 today). In the past, access to a high-performance or supercomputing center was required to get started. Today, data scientists frequently perform initial work on their local computers, cloud nodes, or even commodity GPU hardware.

Once the hardware costs became manageable, the focus turned to the human labor costs of machine learning and artificial intelligence.

The Expensive and Onerous Labeling Problem

To understand why transfer learning was such a revolutionary development, you first need to understand how painful the previous processes were.

Among the most common machine learning use cases are classification problems: you have a set of data and a set of labels, and you want to apply labels to the data -- as in fraud analytics, spam detection, and object recognition. To train a model to recognize these connections in practice, you need a very large set of labeled data. Procuring large amounts of high-quality labeled data is very expensive, and domain experts are required to provide useful labels.

To get a sense of what it costs to label a data set, think about trying to label tumors in CT scans. You'd need thousands of images for each type, and the person qualified to label these images is a specially trained radiologist. Assuming a radiologist charges $150/hour and can fully annotate 4 images an hour, and we need about 10,000 images, we're looking at one pass costing $375,000 (and realistically, you'd make two or three passes):

(10,000 images / 4 images per hour) * $150 per hour = $375,000

The Solution: Transfer Learning

Transfer learning was a revolutionary breakthrough that made it possible to reuse prior work and democratize machine learning models. The first example of transfer learning occurred in 1998, but it has gained attention as a result of the yearly ImageNet Large Scale Visual Recognition Challenge (ILSVRC).

In transfer learning, developers take a base model that has been pre-trained to distinguish between common features using a large quantity of data. Retraining this model on your particular data set is a focused effort that requires a much smaller amount of data and compute resources, making the process accessible to teams that lack the massive resources required to train a precise model from scratch.
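As a concrete illustration, here is a minimal sketch of that workflow in Python using TensorFlow/Keras. It is not the exact pipeline of any particular project; the class count, input size, and classification head are placeholders, but the pattern (load a pre-trained base, freeze it, and train only a small new classifier on your own data) is the essence of transfer learning.

# A minimal transfer-learning sketch in TensorFlow/Keras (illustrative, not any
# specific project's pipeline). InceptionV3 arrives pre-trained on ImageNet; we
# freeze it and train only a small new classification head on our own labels.
import tensorflow as tf

NUM_CLASSES = 120          # placeholder: number of categories in your own data set
IMG_SIZE = (299, 299)      # InceptionV3's expected input resolution

# Pre-trained feature extractor, without the original ImageNet classifier layer.
base = tf.keras.applications.InceptionV3(
    weights="imagenet", include_top=False, input_shape=IMG_SIZE + (3,))
base.trainable = False     # reuse the prior work; these weights are not retrained

# The new head is the only part trained on your (much smaller) labeled data set.
model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(...) would then run over your own labeled images.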

Diving In

Here's a real-world example that illustrates how transfer learning can cut the time and cost of building a model.

I wanted to build a model that could identify a likely dog breed based on a photograph. First, I'd have to build a model that could identify edges and objects, then dogs, and then features of the individual dog breeds.

The ImageNet community provides massive labeled data sets that can be used to train convolutional neural network (CNN) models to classify over 10,000 classes of common objects such as "hamster," "schooner," and "strawberry." The images were labeled through Amazon Mechanical Turk at a cost of roughly $0.02-0.05 per labeled image, meaning a single pass over the initial batch (1.2 million images) cost around $25,000 in 2010. The full data set now contains 9 million images, meaning it would cost more than $250,000 to reproduce.
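In the same back-of-the-envelope style as the radiology example (assuming roughly $0.02 per label for the 2010 batch and about $0.03 per label today, both within the range above):

1,200,000 images * $0.02 per label = $24,000

9,000,000 images * $0.03 per label = $270,000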

Labeling is only one of the costs incurred. In my own experiments, retraining the TensorFlow Inception V3 model on just 20,000 images took over 7 hours on a moderately sized, CPU-based cloud instance. Training a model of this kind from scratch on the full data set of 9 million images would be prohibitively expensive for a hobbyist or a moderately funded team.

Luckily, I didn't have to start from scratch because the initial effort had already been crowdsourced for me. I was able to take the Inception model, which had been trained on the full ImageNet data set, and retrain it on the Stanford Dogs Dataset, which contains 20,000 labeled images of dog breeds. This task took about 30 minutes to complete on a GPU-based cloud instance, and the storage required was less than 2 GB.
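Here is a hedged sketch of what that retraining step can look like in TensorFlow/Keras. It is not my original retraining script; the local path and hyperparameters are assumptions, and it presumes the Stanford Dogs images have been downloaded and unpacked into one folder per breed.

# A sketch of the dog-breed retraining step (illustrative, not the original script).
# It assumes the Stanford Dogs images are unpacked locally as one folder per breed,
# e.g. stanford_dogs/Images/<breed_name>/*.jpg; the path below is a placeholder.
import tensorflow as tf

IMG_SIZE = (299, 299)
train_ds = tf.keras.utils.image_dataset_from_directory(
    "stanford_dogs/Images",                  # hypothetical local path
    image_size=IMG_SIZE, batch_size=32)
num_breeds = len(train_ds.class_names)       # 120 breeds in the Stanford Dogs Dataset

# Scale pixel values to the [-1, 1] range that InceptionV3 expects.
prepped = train_ds.map(
    lambda x, y: (tf.keras.applications.inception_v3.preprocess_input(x), y))

# Frozen ImageNet-trained base; only the new breed classifier is trained.
base = tf.keras.applications.InceptionV3(
    weights="imagenet", include_top=False, input_shape=IMG_SIZE + (3,))
base.trainable = False
model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(num_breeds, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy", metrics=["accuracy"])

# With the base frozen, a few epochs over ~20,000 images is a modest job on a GPU.
model.fit(prepped, epochs=5)

Because all of InceptionV3's feature-extraction layers stay frozen, only the small new classifier is updated, which is why the job fits in roughly half an hour on a single GPU instance instead of requiring a large training cluster.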

Because transfer learning let me reuse crowdsourced labeling and pre-training, I was able to get up and running with a fairly sophisticated computer vision model in a few hours. The only costs I incurred were the minimal costs of running a few cloud instances. You can read more about my experiment here.

A Final Word

Transfer learning has democratized machine learning because it allows developers to reuse prior work, meaning that they don't have to reinvent the proverbial wheel in every situation. This lowers the barrier to entry, allowing more teams to explore and experiment, which will lead to innovation.

About the Author

Rachel Silver is the product management lead for machine learning and AI at MapR Technologies. She is passionate about open source technologies and helps to drive ecosystem adoption initiatives. Previously, Rachel was a solutions architect and applications engineer.

