Done with Failing Data Projects? Get Beyond that Pilot Phase!
Almost all top executives of the world’s biggest companies (99 percent) have data-driven ambitions, but only one third(!) say they have been successful in realizing them. So why do so many data projects fail? Pilot projects spring up like mushrooms, but are rarely taken into full production. In my view, poorly organized data projects are the reason for not getting beyond that pilot phase. In this article, I will explain what you should focus on for your data science projects to succeed.
The massive discrepancy between the number of data projects companies start and the number that eventually succeed is perfectly illustrated in the graph above. In fact, this graph is even a bit too optimistic, since according to figures by Gartner, up to 85 percent of data projects fail. And, even more disturbing, a recent study by PwC indicated that only four percent of the thousands of companies surveyed had successfully implemented an AI solution.
In my experience, when it goes wrong, it’s almost always a poorly organized data science project that causes the failure. It usually goes wrong during the crucial phase between the pilot project and the implementation of a sustainable solution. The path (see illustration below) from good idea to brilliant software solution has several interim steps that simply can’t be skipped. Since every step requires different skills, you’ll have to differentiate between the interim steps in the project: first the Proof of Concept (PoC), then the step to an initial implementation, the Minimum Viable Product (MVP).
This is what you want to achieve with a Proof of Concept
The Proof of Concept helps you to answer the question: can we put this idea into practice? Is it realistic? Do we have all the data we need to have, for example, a chatbot answer complex questions? To prove this, a small team, consisting of at least a data analyst and a data scientist, works for a limited period of time to verify the feasibility of the idea.
To be able to do that, the criteria for the PoC’s success need to be defined in advance. During this phase there also shouldn’t be too many cooks spoiling the broth: having too many stakeholders involved could push the PoC in the direction of an MVP, usually with disastrous consequences, since a PoC can also fail. The quality of the data could, for example, leave much to be desired, there might simply be too little data available, the hypothesized model might not fit the purpose, or the results might be too mediocre to justify continuing the project. It’s better to find that out during the PoC phase than at a later stage, so you know what to improve before you start implementing. After all, investments to improve the model are wasted money if the problem is the result of inferior data quality. And vice versa: investments in data quality are pointless if the model doesn’t fit.
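To make "defined in advance" a bit more concrete, here is a minimal sketch in Python of how a team could write down its PoC criteria before the work starts and then check the results against them afterwards. Everything in it is an assumption for illustration: the criterion names, the thresholds, and the helper evaluate_poc are hypothetical, not figures from any real project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Criterion:
    """One success criterion, agreed on before the PoC starts."""
    name: str
    threshold: float
    higher_is_better: bool = True

    def passed(self, measured: float) -> bool:
        # Compare the measured value with the threshold in the agreed direction.
        if self.higher_is_better:
            return measured >= self.threshold
        return measured <= self.threshold


# Hypothetical criteria for the chatbot example; the numbers are placeholders.
EXAMPLE_CRITERIA = [
    Criterion("answer_accuracy", 0.80),
    Criterion("labelled_examples", 5000),
    Criterion("missing_value_rate", 0.05, higher_is_better=False),
]


def evaluate_poc(criteria, measurements):
    """Return (go, failed_names). A criterion that was never measured counts as failed."""
    failed = [
        c.name
        for c in criteria
        if c.name not in measurements or not c.passed(measurements[c.name])
    ]
    return (not failed, failed)


if __name__ == "__main__":
    go, failed = evaluate_poc(
        EXAMPLE_CRITERIA,
        {"answer_accuracy": 0.72, "labelled_examples": 6200, "missing_value_rate": 0.03},
    )
    print("Continue to MVP" if go else f"Stop or rework, failed on: {failed}")
```

The list of failed criteria is the useful part: it tells you whether the problem lies with the data (too little, too many gaps) or with the model, which is exactly the distinction that decides where further investment makes sense.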
How to move on to an MVP
The step towards the next phase is crucial: now that you’ve tested the concept, you want to see it in production. This will change the way you approach the project. In a PoC, you can afford to ignore reality to a certain extent. Since you only want to know if the idea is feasible, you don’t yet have to worry about secondary matters. But in pursuing a successful MVP, the situation is entirely reversed: now you have to put the concept into practice, which means you have to consider the business processes, GDPR, security, and the connection with existing systems and infrastructure. For example, how will the chatbot integrate with the website?
Just as important, the MVP phase requires a completely different set of skills than the PoC phase. As I wrote in a previous article with tips to increase the chance of a data science project’s success: you need more than just an understanding of data science; business sense is at least as important, and only by adding a healthy dose of IT knowledge will you be able to move mountains. That’s precisely where many data science startups come up short, and it is exactly the reason why so many pilot projects never get beyond the PoC phase. And even if you try to save the project in this phase, for example by contracting one of the big consultancy firms, it’s going to be challenging, since it won’t be long before you notice that they lack the data science skills needed.
These are the roles you’ll need
The phase that follows a successful PoC will present new roles for business translators, software engineers, data engineers, data architects, and data scientists (see illustration below). But other departments will also need to be drawn into the project, such as legal, compliance, and IT. The trick is to make sure that happens at exactly the right moment. That moment will differ for each organization and project, but choosing the wrong one always leads to the same consequences: if you bring these departments in too early, it will slow down the project, but if you do it too late, they might feel left out and try to block it. The length of the phase between PoC and MVP largely depends on how familiar the organization is with these kinds of projects.
After a successful PoC, it is also essential that a role is assigned to a business translator, a role that is becoming increasingly common. This jack-of-all-trades needs to understand what data science can do, but he or she must also be able to talk with the business and subsequently translate its needs for IT and software development.
Time for the real work
You’ve finally reached the point: the first version of the data project is live! The chatbot is on the website, even though it might not yet be able to answer all questions. But don’t worry; that will come later. The good news is: now that the MVP is a success, you can get started with the real work. Because the real challenge will be to make it a scalable solution in your organization – but that’s a topic for my next article.