TDWI | Training & Research | Business Intelligence, Analytics, Big Data, Data Warehousing

TDWI Articles

Win at the Game of Business with Reinforcement Learning

Machine learning is often demonstrated by teaching computers to win games. Now is the time to use machine learning to win the game of business.

By Troy Hiltbrand
March 23, 2018

Throughout the history of humankind, we've been fascinated with games. They let us set a target goal and test strategies to see which are superior. Some games are relatively simple in nature with limited winning strategies; others are more complex.

The game of Go has a near-infinite number of potential paths to success. Developed some 2,500 years ago, it was, according to legend, intended to test the strategic thinking of the Chinese aristocracy. The rules are relatively simple, but the countless paths a game can take are what make Go so challenging.

For this reason, Go was a prime test to see whether a machine could be trained to defeat a human player. In 2016, Google's DeepMind proved that a machine can, indeed, be taught to think strategically and win against the world's best Go players. This win was a landmark in artificial intelligence and machine learning because it demonstrated with practical evidence the concepts behind reinforcement learning.

Until recently, most applications in AI and machine learning fell into one of two categories: supervised learning and unsupervised learning. Many remarkable advances have been made with the models and algorithms associated with these two categories, but problems exist that need a different approach.

Reinforcement learning is a third category of machine learning that is garnering attention in both academia and business. As a field, reinforcement learning has been around since the 1980s, but real-world applications are only now emerging thanks to the recent explosion of computing power. It has the potential to change business as we know it.

Strategic Thinking and the Customer Journey

Using machine learning to teach computers to win at games is novel and even groundbreaking, but the real value comes when we apply the same technology and concepts to real-world business problems. What made DeepMind's AlphaGo different was that, after an initial grounding in recorded human games, the machine improved in much the same way a new player does: by trial and error, playing against itself and getting better with each game.

In the business environment, there are many problems similar in nature to the game of Go. The rules are relatively simple but the number of potential paths to success is seemingly infinite. This is why reinforcement learning has such great transformational potential.

Nowhere is this more critical than in what is popularly called the customer journey. This is the set of steps a customer progresses through when attracted by a business offering to learn more about the company, explore its products, or make a buying decision. Companies such as Amazon have spent considerable time and money optimizing their customer journeys, which has led to dominance in the marketplace.

The customer journey is much like a game where the business is seeking the optimal path for each customer. In this process, companies identify the path for moving strategically through a complex set of choices for how to engage the customer, when to engage the customer, and what to offer the customer at different points to optimize their experience. Optimization of the customer journey with machine learning is a new frontier in which reinforcement learning is a fundamental component.

The Four Components of Reinforcement Learning

To understand reinforcement learning, think of the problem as a series of events that ultimately leads to a reward. Reinforcement learning is about letting the computer test different paths and determine if the results improve. Upon finding a better path, the model is updated to account for this new knowledge.

This series of incremental improvements is what machine learning is all about. Over time, this learning creates a model that is robust enough to choose the correct path with increasing effectiveness but also abstract enough that it can predict the best path even when presented with new information.

There are four basic parts of reinforcement learning: states, actions and transitions, rewards, and policies. These come from the Markov decision process (MDP), a mathematical framework for modeling decision making where outcomes are partly random and partly under the control of a decision maker.

States

A state represents a step in the flow. For a customer journey, the state is where the customer is at a given point in time. This could range from a customer who has never heard of a company and its offering to a loyal customer who has purchased in consecutive months. In each state, the environment for that customer is different, and so is where they can and will go next.

Actions and Transitions

An action is the activity that occurs to move a customer from one state to the next. This action could be one that the customer instigates, such as a purchase, communication with customer service, a product return, or a social media post. It could also be an action instigated by the company, such as an email campaign, a promotion, or the fulfillment of an order. Each of these actions moves the customer from one state in the process to another.
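
States, actions, and transitions can be pictured as a simple lookup table. The sketch below is a minimal illustration, not a model from any real business: the state names, action names, and probabilities are all hypothetical. It shows how the same action can lead to different next states, which is the "partly random" side of the decision process.

```python
import random

# Hypothetical journey: (state, action) -> probability of each next state.
# All names and numbers here are made up for illustration.
TRANSITIONS = {
    ("unaware", "ad_campaign"): {"aware": 0.3, "unaware": 0.7},
    ("aware", "email_campaign"): {"considering": 0.4, "aware": 0.6},
    ("considering", "promotion"): {"purchased": 0.5, "considering": 0.5},
}

def step(state, action, rng):
    """Sample the customer's next state after an action is taken."""
    outcomes = TRANSITIONS[(state, action)]
    next_states = list(outcomes)
    weights = [outcomes[s] for s in next_states]
    return rng.choices(next_states, weights=weights, k=1)[0]

rng = random.Random(0)  # seeded so repeated runs give the same sample
next_state = step("unaware", "ad_campaign", rng)
print(next_state)
```

The company chooses the action; the table decides, with some randomness, where the customer lands.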

Rewards

A reward, whether positive or negative (a punishment), is the result of one transition or a series of transitions. In reinforcement learning, it is the reward that helps the machine learn and evolve over time, optimizing the process so that it maximizes rewards and minimizes punishments.

One important aspect of rewards is their discounted time value. A reward now is more valuable than the same reward in the future, and the model accounts for this discount as it learns.
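
As a rough illustration of discounting, the sketch below weights a reward that arrives t steps in the future by a discount factor raised to the power t. The function name and the discount factor of 0.9 are assumptions chosen for the example, not values from the article.

```python
def discounted_return(rewards, gamma=0.9):
    """Total value of a reward sequence, with step t weighted by gamma**t."""
    return sum(r * gamma**t for t, r in enumerate(rewards))

# The same 100-unit purchase, received now versus three steps later.
now = discounted_return([100, 0, 0, 0])
later = discounted_return([0, 0, 0, 100])
print(round(now, 2), round(later, 2))
```

With these assumed numbers, the delayed purchase is worth roughly 73 rather than 100, so a model that discounts this way prefers paths that reach the reward sooner.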

At times, you cannot measure the reward at each state and might only be able to see the reward at the end of a series of transitions. A positive reward such as a customer purchase could be the result of multiple email and ad campaigns, each transitioning the customer to a new state. The company does not necessarily know which communication ultimately drove the customer to the business reward, but the ability to track which campaigns had an impact on which customers is important. Working backward from the reward, machine learning can determine which patterns consistently lead to the reward.
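
One simple way to work backward from a delayed reward is to pass it down the chain of steps, discounting at each hop. The sketch below assumes a hypothetical three-touch journey that ends in a 50-unit purchase; it is only one of several credit-assignment approaches, and the names and numbers are illustrative.

```python
def returns_to_go(rewards, gamma=0.9):
    """Walk backward from the last step, giving each step its discounted future reward."""
    credit = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        credit[t] = running
    return credit

# Hypothetical journey: only the final touch is followed by a measurable reward.
journey = ["ad_campaign", "email_campaign", "promotion"]
rewards = [0.0, 0.0, 50.0]
credit = returns_to_go(rewards)
for touch, value in zip(journey, credit):
    print(f"{touch}: {value:.2f}")
```

Earlier touches receive less credit than the touch closest to the purchase, but none is ignored. Averaged over many customers, values like these indicate which patterns consistently precede a reward.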

Policies

A policy is a set of rules that guide the action(s) and transition(s). It is the policy that evolves as the machine garners more information from transitions that result in rewards. The policy is applied to future interactions to automate decisions that will allow a company to win the game. The policy is often used to determine the customer's next-best action or where the company should focus next to improve the likelihood of earning a reward.
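
To make the idea of an evolving policy concrete, here is a minimal sketch using tabular Q-learning, one common reinforcement learning algorithm (the article does not name a specific one). The toy journey, its rewards, and the learning settings are all hypothetical, and the learner explores with purely random actions to keep the code short.

```python
import random

# Hypothetical deterministic journey: (state, action) -> (next_state, reward).
ACTIONS = ["email", "promotion"]
ENV = {
    ("aware", "email"): ("considering", 0.0),
    ("aware", "promotion"): ("churned", -1.0),  # assumed: discounting too early backfires
    ("considering", "email"): ("aware", 0.0),
    ("considering", "promotion"): ("purchased", 10.0),
}
TERMINAL = {"purchased", "churned"}

def train(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    """Tabular Q-learning; actions are chosen uniformly at random while learning."""
    rng = random.Random(seed)
    q = {key: 0.0 for key in ENV}
    for _episode in range(episodes):
        state = "aware"
        for _step in range(20):  # cap the length of any one journey
            action = rng.choice(ACTIONS)
            next_state, reward = ENV[(state, action)]
            if next_state in TERMINAL:
                future = 0.0
            else:
                future = max(q[(next_state, a)] for a in ACTIONS)
            # Nudge the estimate toward reward plus discounted future value.
            q[(state, action)] += alpha * (reward + gamma * future - q[(state, action)])
            if next_state in TERMINAL:
                break
            state = next_state
    return q

def greedy_policy(q):
    """The learned policy: the highest-valued action in each state."""
    states = sorted({s for s, _ in q})
    return {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in states}

policy = greedy_policy(train())
print(policy)
```

In this toy setup the learned policy sends an email to customers who are merely aware and saves the promotion for those already considering a purchase, which is the "next-best action" idea in miniature. A real system would also need to balance exploring new actions against exploiting what it has already learned.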

Playing to Win

When fully leveraged, this new form of machine learning has great value in improving how businesses operate and interact. The game of business is very similar to other games, but the stakes are usually much higher. The goal is to employ different strategies in the hopes that your strategy is better than that of your competition.

As the science and application of reinforcement learning become more popular and more powerful, the companies that master it will dominate their markets and those that don't will slowly disappear.

About the Author

Troy Hiltbrand is the senior vice president of digital product management and analytics at Partner.co where he is responsible for its enterprise analytics and digital product strategy. You can reach the author via email.
