TDWI | Training & Research | Business Intelligence, Analytics, Big Data, Data Warehousing

TDWI Articles

Machine Learning and AI with Keith McCormick

From machine learning and transparency to unstructured data and career advice, data scientist Keith McCormick shares his insights on what’s worth paying attention to in the world of AI.

By Upside Staff
May 23, 2024

In the latest Speaking of Data podcast, Keith McCormick, an executive data scientist at Pandata, shared his opinions and recommendations about machine learning, AI, and transparency, along with some career advice. [Editor’s note: Speaker quotations have been edited for length and clarity.]

For Further Reading:

Q&A: An Introduction to Deep Learning

How Ethical AI Is Redefining Data Strategy

Entering the Age of Explainable AI

McCormick explained that when he sits down with clients, he first points out that although there are many exciting new topics, applications such as scoring marketing leads, detecting fraud, and detecting anomalies have been in use not just for years but for decades. That continuity is important because you don't have to reinvent the wheel, and you don't have to reconceptualize these problems with the latest techniques. These are familiar use cases.

“I think everybody senses that we're going through something new. Even though I can look back over all these years and see some continuity, there’s clearly something going on that has everybody's attention, and it’s not just hype. How can you separate what can be a little bit dramatic from the reality?

“One of the things that I get a chance to talk about in my courses is how important 2012 was. That’s when there was a big, big event called the ImageNet Competition [a visual-recognition challenge]. It was the third year of the competition and deep learning was a big deal because it thoroughly beat the previous techniques. Is that a chair, a bicycle, a cat, a dog, or a hot dog? I'm not sure how important the hot dog was to ImageNet, but it's certainly important in pop culture, that you can correctly identify hot dogs.” This sparked a great deal of excitement about deep learning at the time.

When McCormick talks to clients about their problems today, “there’s this level of excitement. I’ll be approached and told ‘My boss has asked me to find a way to use large language models in our organization. We don't know what we want to do, but our boss really wants us to do something.’ That's not how you're supposed to start the conversation. You're supposed to start the conversation with ‘We have problem X, and given your experience, what would be the best way to tackle it?’ It might, indeed, be a large language model, but you must start with the problem. You can't start with ‘Wow, we really wish we had a use case for large language models, because everybody seems to be using them, and we don't want to be left behind.’”

Structured and Unstructured Data

Deep learning has made enormous strides, as have chatbots. These technologies have one thing in common: they work with unstructured data. “The stuff I've been doing since the 1990s is all structured. It's one row per insurance claim, one row per customer. That hasn't gone away, so that's why I sometimes sit down with a client and conclude that old school -- or what I call traditional -- machine learning techniques absolutely fit the bill. If I'm talking to somebody about security in a museum or drone delivery -- that's not structured data. How are you going to get the information that it takes to fly under a bridge to see where a crack might be? How does that fit in your Excel spreadsheet? It doesn't. We're talking video that has to be annotated.”

McCormick thinks it's fair to say that “there are probably more organizations that should be focused on their structured data. Nonetheless, we're heading in a direction where everybody's going to have a mix of use cases, and it's probably going to be in different teams. Just like we have BI and data science, there's going to be a day in the not-too-distant future where we have an AI team and a traditional machine learning team.”

Responsible AI and Transparency

The newer techniques McCormick was talking about produce complicated models, and responsible AI involves many things. “Certainly, there's ethics involved, and the potential for bias, whether you're talking about favorable rates on a mortgage or an insurance policy. However, I think most people have read quite a bit about that aspect of it. Most fundamental to responsible AI is model transparency, because if the model is opaque -- the so-called black box models -- you really can't avoid such things as bias.”

There are other problems with a lack of transparency. Without transparency, it's hard to know when the models make mistakes and why they're making them. There are hallucinations to consider. Thought leaders say that we really don't fully understand how the big foundation models work. That’s why when McCormick works with clients, he seeks that transparency, even when the company is not required to have transparent models. “In healthcare, transparency might be a condition of the project, but for many clients, who don't have some regulation forcing them to have model transparency, it’s still important to think about model transparency.

“There is a set of techniques called explainable AI where you can try to pull out of the model reasons why a particular prediction was made. When customers apply to refinance their mortgage and are denied, they may be given a reason code they can look up on the web. That's an example of explainable AI that's been around for a long time. Within the last five years -- it really has been that recent -- there's been an explosion in these explainable AI techniques.

“Deep learning is always opaque. Deep learning is the engine and explainable AI is the caboose -- it's getting pulled right along. More people need these explanations because they're building complex models. It's just a matter of time before there are more regulations around explainable AI.”
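To make the reason-code idea concrete, here is a minimal sketch of how a transparent model can explain a denial. It assumes a simple linear scoring model; the feature names, weights, baseline values, and reason texts are all invented for illustration and do not come from McCormick or from any real lender.

```python
# Hypothetical weights for a transparent, linear scoring model.
# Every name and number here is an assumption made for illustration.
WEIGHTS = {"credit_score": 0.01, "debt_to_income": -4.0, "late_payments": -0.5}
BIAS = -5.0

# Hypothetical "typical approved applicant" used as the comparison point.
BASELINE = {"credit_score": 700, "debt_to_income": 0.30, "late_payments": 0}

REASON_TEXT = {
    "credit_score": "Credit score below typical approved level",
    "debt_to_income": "High debt-to-income ratio",
    "late_payments": "Recent late payments",
}


def score(applicant):
    """Linear score; in this sketch a negative score means the application is denied."""
    return BIAS + sum(WEIGHTS[name] * applicant[name] for name in WEIGHTS)


def reason_codes(applicant, top_n=2):
    """Return the features that pulled the score down most, relative to the baseline."""
    contributions = {
        name: WEIGHTS[name] * (applicant[name] - BASELINE[name]) for name in WEIGHTS
    }
    # Keep only features that lowered the score, most harmful first.
    harmful = sorted((value, name) for name, value in contributions.items() if value < 0)
    return [REASON_TEXT[name] for _, name in harmful[:top_n]]


applicant = {"credit_score": 640, "debt_to_income": 0.50, "late_payments": 3}
if score(applicant) < 0:
    print("Denied:", reason_codes(applicant))
```

Because every feature's contribution is just its weight times its distance from the baseline, the explanation falls straight out of the model. Opaque models do not offer that directly, which is why separate explainable AI techniques are needed for them.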

Career Advice

When asked what advice he has for aspiring data scientists regarding machine learning, McCormick said that many newly minted data scientists (or people thinking about a career change) might be surprised by his list because his advice is about the most foundational things.

First, he recommends aspiring data scientists understand linear regression. “It sounds so old school, but if you don't really understand linear regression -- thoroughly understand it -- you can't truly understand what neural nets do and why neural nets are able to figure out things without a lot of human help.”
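One way to see the connection between linear regression and neural nets is the sketch below, which assumes a tiny made-up data set. It fits the same line twice: once with the textbook least-squares formula and once as a single "neuron" with no activation function, trained by gradient descent. The function names and numbers are illustrative assumptions, not anything from the podcast.

```python
def fit_closed_form(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_linear_neuron(xs, ys, lr=0.05, epochs=5000):
    """One neuron, no activation, trained by gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Made-up data that roughly follows y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 7.1, 8.8]

print(fit_closed_form(xs, ys))
print(fit_linear_neuron(xs, ys))
```

Both approaches arrive at essentially the same slope and intercept. A neural net stacks many such units and adds nonlinear activations between them, so a solid grasp of the single-unit case carries over.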

Also on his list: understanding decision trees. “A bit mundane or old school,” he admits, but he still teaches the topic because “decision trees are still useful in their own right. Even if someone is skeptical and says they want to use something a little bit fancier and beef up their portfolio, I want to explain that you can't understand random forest and XGBoost, which are two of the most powerful contemporary algorithms out there, without understanding decision trees.”
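The dependency McCormick describes can be sketched in a few lines. The example below is a deliberately simplified illustration, not the actual random forest or XGBoost algorithm: it uses one-split trees ("stumps"), each trained on a bootstrap sample and a randomly chosen feature, and combines them by majority vote. The data and all function names are invented for the example.

```python
import random
from collections import Counter


def fit_stump(rows, labels, feature):
    """One-split decision tree: pick the threshold on `feature` with the fewest errors."""
    best = None
    for threshold in sorted({row[feature] for row in rows}):
        # Try both label assignments for the two sides of the split.
        for left, right in ((0, 1), (1, 0)):
            errors = sum(
                (left if row[feature] <= threshold else right) != label
                for row, label in zip(rows, labels)
            )
            if best is None or errors < best[0]:
                best = (errors, threshold, left, right)
    _, threshold, left, right = best
    return feature, threshold, left, right


def predict_stump(stump, row):
    feature, threshold, left, right = stump
    return left if row[feature] <= threshold else right


def fit_forest(rows, labels, n_trees=25, seed=0):
    """Train many stumps, each on a bootstrap sample and one randomly chosen feature."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # sample with replacement
        feature = rng.randrange(len(rows[0]))
        forest.append(fit_stump([rows[i] for i in idx], [labels[i] for i in idx], feature))
    return forest


def predict_forest(forest, row):
    """Majority vote across all trees."""
    votes = Counter(predict_stump(stump, row) for stump in forest)
    return votes.most_common(1)[0][0]


# Made-up data: two well-separated clusters.
rows = [(1, 1), (2, 1), (1, 2), (2, 2), (8, 8), (9, 8), (8, 9), (9, 9)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

forest = fit_forest(rows, labels)
print(predict_forest(forest, (1, 1)), predict_forest(forest, (9, 9)))
```

Real random forests grow much deeper trees, and XGBoost builds its trees sequentially so that each one corrects the errors of those before it. In both cases the building block is still the decision tree.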

Concluding his advice: a topic he says doesn't get enough attention. “Know the machine learning life cycle. I'm a big fan of the cross-industry standard process for data mining (CRISP-DM). Even if you're alone on a team, you have to manage the project, and if you're running a team and there are two or three data scientists working together, you must have a structure to go through this journey. Regression, decision trees, and machine learning life cycle are key.”

