Why Data Science Projects Fail — And What To Do Next

By Kris Schroeder, Business Architect & Agilist

As data science becomes necessary for organizations to keep up in the marketplace, delivery teams are encountering a new problem: a lack of standard approaches to delivering data science projects. Teams often default to using practices that are not repeatable or sustainable. This leads to low project maturity, lack of continuous improvement, and inadequate feedback. Without consistent feedback loops, organizations are potentially missing out on delivering valuable insights to customers.

In this first blog of a three-part series, we'll delve into the top five reasons for stalled or unsuccessful data science projects.

Reason #1: Lacking clear business objectives

One of the biggest issues data scientists face is a lack of clear business objectives. Symptoms that indicate this gap include:

  • Data scientists are unaware of the outcomes stakeholders hope to achieve
  • The team builds something too complex for a simple problem
  • No one knows how customers will interact with the model

Whether the organization has failed to be clear about its objectives or the data scientists are lacking the skills to define the problem and success criteria, the result is an inability to develop a good hypothesis and, therefore, a productive model.

Reason #2: An underdeveloped delivery methodology

An effective delivery methodology is the foundation necessary for successful data science teams. Does your team know if a model is good enough to be used in production? Does your team have a way to operationalize the model in production? If not, those are clear signs that your delivery methodology lacks maturity. A lack of top-down strategy and sponsorship can also leave teams relying on a complicated, decentralized data pipeline.
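To make the "good enough for production" question concrete, here is a minimal sketch of a release gate. It assumes the team has agreed on two acceptance criteria up front: an accuracy floor and a minimum gain over a baseline. The names ReleaseCriteria and is_production_ready, the choice of accuracy as the metric, and the threshold values are all hypothetical illustrations, not a prescribed method.

```python
from dataclasses import dataclass


@dataclass
class ReleaseCriteria:
    """Hypothetical acceptance criteria agreed with stakeholders."""
    min_accuracy: float            # absolute floor for the candidate model
    min_gain_over_baseline: float  # required improvement over the baseline


def accuracy(predictions, labels):
    """Fraction of predictions that match the labels."""
    if len(predictions) != len(labels) or not labels:
        raise ValueError("predictions and labels must be non-empty and equal length")
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    return correct / len(labels)


def is_production_ready(candidate_preds, baseline_preds, labels, criteria):
    """Return (decision, details) for a candidate model on a held-out set."""
    cand = accuracy(candidate_preds, labels)
    base = accuracy(baseline_preds, labels)
    gain = cand - base
    ready = cand >= criteria.min_accuracy and gain >= criteria.min_gain_over_baseline
    return ready, {"candidate": cand, "baseline": base, "gain": gain}
```

A team might call is_production_ready(candidate, baseline, labels, ReleaseCriteria(0.8, 0.05)) on held-out data at the end of each iteration. The point is less the specific metric than having the criteria written down before the model is built.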

Reason #3: Knowledge gaps around the data science lifecycle

If stakeholders are anticipating a working model after a two-week sprint, then the organization has not been educated on the inherently unpredictable nature of data science work. An organization that expects the milestones and timelines it is accustomed to from a software development team will be disappointed, because the data science lifecycle is iterative and its outcomes are inconsistent. Other pain points that arise when knowledge of the data science lifecycle is lacking include inconsistent use of terminology among data, software, and DevOps engineers; missing dependencies between the data science and platform teams; and stakeholders who resist making decisions based on model predictions.

Reason #4: No clear strategy for data governance/quality

Lacking data at the start of a project, poorly structured data, inconsistencies in the data, and dirty data that requires cleansing are going to block your ability to deliver working models effectively. The five V’s of data (velocity, volume, value, variety, and veracity) need to be understood or at the very least discussed prior to the start of any data science project.
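As a starting point for that discussion, a quick profile of a data sample can surface veracity problems early. The sketch below is illustrative only: the function name profile_records, the record layout (a list of dictionaries), and the three checks chosen (duplicates, missing values, mixed types) are assumptions, and a real project would likely need checks specific to its own data.

```python
def profile_records(records, required_fields):
    """Summarize basic quality problems in a list of dict records."""
    missing = {field: 0 for field in required_fields}
    types_seen = {field: set() for field in required_fields}
    seen_rows = set()
    duplicates = 0

    for row in records:
        # Build a hashable fingerprint of the row to detect exact duplicates.
        key = tuple(sorted((k, repr(v)) for k, v in row.items()))
        if key in seen_rows:
            duplicates += 1
        seen_rows.add(key)

        for field in required_fields:
            value = row.get(field)
            if value is None or value == "":
                missing[field] += 1
            else:
                types_seen[field].add(type(value).__name__)

    # A field holding more than one Python type hints at inconsistent data.
    mixed_types = sorted(f for f, t in types_seen.items() if len(t) > 1)
    return {
        "rows": len(records),
        "duplicates": duplicates,
        "missing": missing,
        "mixed_types": mixed_types,
    }
```

Running a profile like this on a sample before the project starts gives the team and stakeholders a shared, concrete view of how much cleansing work lies ahead.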

Reason #5: Data science skills gaps

A shortage of data scientists is impacting everyone in the IT industry. Even if you can fill a team with top-notch statisticians and MLOps engineers, you are likely still lacking the required skill sets for a successful data science project. Business domain knowledge, pipeline expertise, and the soft skills to uncover the business problem to be solved are all critical as well. Maturing your ability to deliver also requires your team to understand agility and the skills necessary to shorten feedback loops.

Gauging impact

Investing time evaluating where your organization may be falling short can be extremely valuable in helping you begin to chart an informed path forward. A mapping exercise like the one below can help unlock new levels of visibility around your own data science projects. It's up to you and your team to identify where your organization has the most opportunity to grow.

[Figure: Challenges for data science work]

In the next two parts of this series, we'll look at how to improve your data science project maturity and discuss data science teams and agility.