Why Context, Consistency, and Collaboration are Key to Data Science Success

If you want your data science team to achieve more, make sure your data science practice meets these three criteria.

Given how quickly the fields of artificial intelligence and machine learning are growing, and the opportunities for profound insights that come with that growth, best-in-class data science requires more than one scientist on a laptop. Once you have a data science team, its members must work together; important information needs to be shared about data prep, the results of prior projects, and the best way to deploy a model.

Today, if you want your team to move faster, you need context, consistency, and secure collaboration in data science. In this article, we'll examine each of these requirements.

Context

Model building is an iterative, try-it-and-fail experimental practice, and it is true that this work is often performed by one data scientist at a time. However, a great deal of institutional knowledge is lost if that data scientist doesn't document their work, store it, and make it searchable by others.

Further, what of junior or citizen data scientists looking to jump into a project to improve their skills? Both synchronous and asynchronous collaboration rely on context: anyone joining needs to know what data they're looking at, how people have addressed the problem in the past, and how prior work informs the current landscape.

The process of documenting projects, models, and workflows can feel like a distraction when set against the more immediate need to move a model into production. Leaders need to support a culture of knowledge sharing so the whole company benefits and the data science team can build a foundation of expertise and knowledge.

For example, leaders might treat the insights data scientists contribute to the broader knowledge base as part of standard review and feedback sessions, so that collaboration is recognized as an essential principle at the company. Software systems, workbenches, and best practices can help streamline the capture of context and improve discoverability later. Without knowledge management and context, new employees struggle with onboarding, slowing their ability to contribute, and teams spend time re-creating projects instead of building on previous work -- which can slow down the entire enterprise.
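As a concrete illustration, here is a minimal sketch of what capturing project context in a machine-readable, searchable form might look like. Everything in it -- the schema, the file name, the project details -- is a hypothetical example rather than a prescribed standard; in practice a workbench or model registry would manage this record for you.

```python
# A minimal sketch of recording project context so teammates can find it
# later. The schema and all field values are illustrative assumptions.
import json
from datetime import date
from pathlib import Path

project_card = {
    "project": "churn-model-q3",                       # hypothetical project name
    "owner": "jane.doe",
    "date": date.today().isoformat(),
    "data_sources": ["warehouse.customers", "warehouse.support_tickets"],
    "data_prep_notes": "Dropped accounts younger than 30 days; imputed median tenure.",
    "prior_work": ["churn-model-q1"],                  # points readers to earlier attempts
    "outcome": "AUC 0.81 on holdout; deployed as batch scoring job",
    "tags": ["churn", "classification", "retention"],
}

# Writing the card next to the project makes it easy to grep or index later.
Path("PROJECT_CARD.json").write_text(json.dumps(project_card, indent=2))
```

Even a lightweight record like this gives a new team member the data sources, the prior attempts, and the outcome without digging through someone else's notebooks.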

Building this foundation of knowledge also reduces key person risk. If someone goes on vacation or leaves a project, other team members have the necessary base from which to jump in and keep that project going.

Consistency

We've already witnessed amazing results from the machine learning (ML) and artificial intelligence (AI) fields. Financial services, health and life sciences, manufacturing -- all are going through foundational changes thanks to AI and ML. However, these industries are also heavily regulated, and for an AI project to genuinely change such an industry, it needs to be reproducible, with a clear audit trail. IT and business leaders need to know there's a consistency to the results that will give them confidence in making the strategic business shifts that AI can facilitate. With so much riding on these projects, data scientists need an infrastructure that gives them full reproducibility from beginning to end and convinces top executives of a project's significance.

As data science teams grow and the variety of tools, training sets, and hardware requirements becomes more complex, getting consistent results from older projects can be challenging. Processes and systems for environment management are a must for growing teams. For example, if you're working off your laptop as a data scientist and a data engineer is running a different version of a library on a cloud VM, you may see your model generate different results from one machine to the next. This can occur because open source model-building libraries often change default parameter settings as new best practices become established, so the same code run under two different versions of the library will produce different models. Collaborators need a consistent way of sharing the exact same software environments.
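To make this concrete, the sketch below pins hyperparameters explicitly and records the exact package versions next to the saved model so a collaborator can rebuild the same environment. It is a minimal illustration assuming scikit-learn and joblib are installed; the file names and package list are arbitrary choices, not a specific product's workflow.

```python
# Minimal sketch: snapshot the software environment alongside a model.
# File names and the package list are illustrative assumptions.
import json
import platform
import sys
from importlib.metadata import PackageNotFoundError, version

import joblib
from sklearn.linear_model import LogisticRegression

def snapshot_environment(packages):
    """Record the interpreter, OS, and package versions for this run."""
    snap = {"python": sys.version, "platform": platform.platform()}
    for pkg in packages:
        try:
            snap[pkg] = version(pkg)
        except PackageNotFoundError:
            snap[pkg] = "not installed"
    return snap

# Pin hyperparameters explicitly instead of relying on library defaults,
# which can change between releases (scikit-learn 0.22, for example,
# changed LogisticRegression's default solver).
model = LogisticRegression(solver="lbfgs", C=1.0, max_iter=1000)
# model.fit(X_train, y_train)  # training data omitted in this sketch

joblib.dump(model, "model.joblib")
with open("model_environment.json", "w") as f:
    json.dump(snapshot_environment(["scikit-learn", "numpy", "joblib"]), f, indent=2)
```

Comparing the recorded versions against a collaborator's machine before comparing model outputs catches most "works on my laptop" discrepancies early.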

Retraining and updating data science models is becoming more important as the field matures and grows in relevance. Models evolve over time, and data can start to drift as more information is captured. Thinking of a model as "one and done" is incompatible with a changing business world that brings new pricing models or product offerings.

The key is to recognize that when the business changes, the data changes, and the best leaders pay attention to refreshing and retraining their models on an ongoing basis. An inventory of model versions helps manage those changes and measure performance across models over time -- and those models add to an institution's intellectual property.
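As a simple illustration of what an ongoing drift check might look like, the sketch below compares a feature's training-time distribution against recent production data using a two-sample Kolmogorov-Smirnov test. The threshold, feature, and synthetic data are illustrative assumptions; real monitoring would track many features and feed the results into the model inventory.

```python
# A minimal drift-check sketch, assuming NumPy and SciPy are available.
# The p-value threshold and the synthetic "price" feature are illustrative.
import numpy as np
from scipy.stats import ks_2samp

def needs_retraining(train_feature, recent_feature, p_threshold=0.01):
    """Two-sample Kolmogorov-Smirnov test: a small p-value suggests the
    recent data no longer matches the training distribution."""
    statistic, p_value = ks_2samp(train_feature, recent_feature)
    return p_value < p_threshold, statistic, p_value

rng = np.random.default_rng(0)
train_prices = rng.normal(loc=100, scale=15, size=5_000)   # historical data
recent_prices = rng.normal(loc=110, scale=15, size=5_000)  # shifted market

drifted, stat, p = needs_retraining(train_prices, recent_prices)
print(f"drift detected: {drifted} (KS statistic={stat:.3f}, p={p:.2e})")
```

When a check like this fires, the team can pull the current model version from the inventory, retrain on fresh data, and log the new version alongside the old one for comparison.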

Secure Collaboration

We've seen how a foundation of prior knowledge can quickly accelerate new projects and how you need consistent results (or at least trackable results) when solving the complex questions that deliver value for businesses today.

You also need a third component. With the increase in remote work, many enterprises discovered that collaborating on data science is much harder than it was when employees worked shoulder to shoulder. Yes, some core work can be handled by a lone data scientist -- such as prepping the data, researching, and iterating on new models -- but too many leaders have made the mistake of not encouraging collaboration, and productivity has suffered as a result.

How do you coordinate data scientists, engineers, and domain experts -- along with IT, operations teams, and executive leadership -- all while keeping your data safe? How do you bring these different perspectives and ideas together, ensuring everyone is working from a single source of truth -- and that this data is secured by enterprise-grade, cloud-based services? Shared documents, emailed spreadsheets, public code repositories, and internal wikis are all quick and easy ways to share information -- but the easier it is to share information, the easier it is for information to leak out.

Few people enjoy digging through emails or comparing file versions to make sure they have the right data; relying on a scattered set of sources just adds unnecessary cognitive load. By using a cloud-based tool, data science professionals can bring enterprise security to data science research and leverage IT best practices.

A Final Word

Seeing how far data science has progressed in the past few years has been amazing. Data scientists are helping companies around the world answer formerly unsolvable questions with confidence. However, as our field matures, it's time to move out of "flying by the seat of our pants" mode. Digital tools such as software workbenches that provide context, facilitate consistency, and enable secure collaboration will help us make data science more useful and more consistent with less effort.

About the Author

Joshua Poduska is the chief data scientist with Domino Data Lab, a data science platform that accelerates the development and deployment of models while enabling best practices such as collaboration and reproducibility. He has 18 years of experience in analytics. His work experience includes leading the statistical practice at one of Intel's largest manufacturing sites, working on Smarter Cities data science projects with IBM, and leading data science teams and strategy at several big data software companies. Josh has a master's degree in applied statistics from Cornell University. You can reach the author via Twitter or LinkedIn.

