How to Keep Control in Scaling Data Science

Despite huge investments, many organizations fail to organize data science and analytics in a scalable way. The sky-high expectations are not realized, because companies struggle to integrate their data activities into their day-to-day decision-making processes. In this article, I will discuss the three biggest stumbling blocks and explain which organizational changes are necessary in order to structurally extract value from data.

Everyone knows the examples of failed software implementations at government agencies and companies: millions of euros down the drain and years of work wasted on a final result that ultimately delivers nothing. If we’re not careful, many data science and analytics projects will meet the same fate. Large organizations invest heavily in data science and artificial intelligence (AI), but typically do not get much further than one or two successful projects. According to a recent study by McKinsey, only a handful of organizations actually succeed in making these data activities scalable and integrating them into their strategy and critical decision-making processes. Yet scalability and integration into daily practice are crucial if you are to create structural value with data and analytics. In a previous article, I explained which path companies should take to develop a Proof of Concept (PoC) into a Minimum Viable Product (MVP) as efficiently as possible. The next step is to achieve scalable, integrated success with data-driven initiatives throughout the organization, so that data science and AI applications become a strategic asset. The goal is to run each subsequent project more quickly and efficiently, and to embed it in the organization more easily, by learning from previous successful projects. This allows organizations to apply data science faster and with greater impact, at relatively less effort. In what follows, I will discuss the most common stumbling blocks, as well as the organizational changes needed to overcome them.

Stumbling block 1: willingness amidst an unorganized mess
According to McKinsey, only a very small portion of organizations has a truly sound approach in place. For the most part, organizations are still an unorganized mess in terms of data science and AI. Several distinct teams enthusiastically organize all sorts of initiatives, develop initial trials, or venture to build a PoC. There is, however, no standard approach, there are no fixed roles, and there is no clearly defined structure of responsibility for the various sub-areas of a project. Due to this fragmented approach, projects fail through lack of expertise, overshoot their objectives, and a lot of duplicate work is carried out. You will often see that AI applications become an end in themselves, rather than a means to an end.

Solution: repeatability
The solution to the situation sketched above is creating repeatability. This can be done by safeguarding knowledge, and by developing, documenting, and sharing standard methods for implementing data science applications. One way to do this is to set up a center of excellence, which supports the business and spreads through the organization like an oil slick: you accompany a particular business unit on its journey until it has reached a level of maturity that allows it to operate independently. This approach ensures you do not have to keep reinventing the wheel and guarantees there is always a central team to which everyone in the organization can turn. Within projects, it is best to work with cross-functional teams, consisting of employees involved in the core business, IT specialists, data scientists, and business translators. The latter role, that of business translator, is crucial for safeguarding and integrating your data science applications within the organization and, ultimately, for ensuring their success.

Stumbling block 2: plenty of ideas, no bigger picture
Companies whose analytics efforts can best be described as an unorganized mess often have no one keeping an eye on the bigger picture, even though this can be as simple as mapping out all core business processes for which data science may have value. As a result, various teams or units work on their own data projects, which not only increases the likelihood of duplication but also causes many opportunities to be overlooked: perhaps your customer churn model can also be used to predict employee turnover or absenteeism. For multinationals in particular, which generally work with multiple divisions and business units, keeping an eye on the bigger picture is crucial in order to recoup investments and get real, continuous value from data science projects. After all, the bigger picture also helps you focus on the business processes that generate the most profit or contribute directly to the organization’s strategy. Applying data science is mainly about boosting effectiveness or efficiency, not necessarily about developing the most original or innovative application.

Solution: reusability
Here, too, the center of excellence, preferably with a Chief Data Officer at its helm to set strategy and keep control, has a crucial role to play. The center of excellence is essential in orchestrating the overall process and linking existing solutions to new challenges. To make sure that no opportunities are overlooked and that the organization can seize them as efficiently as possible, you must be able to generalize problems and solutions, so that they can be reused to solve similar problems. Reusability can apply to analytical models and technological architecture, as well as to the uniformity of data sources and the transformation processes used to unlock that data. In some cases, you may have to look beyond your own organization and muster the courage to adopt existing market solutions. Why build something yourself if major players such as Microsoft, IBM, or Google already offer the perfect solution? In that case, it is crucial that your IT architecture allows these external solutions to be used and integrated.

Stumbling block 3: day-to-day reality creeps in after the project
Ultimately, the core aim is for everyone in the organization to benefit from, and actually use, what data science has to offer. McKinsey labels ‘the last mile’ – embedding an application into an organization’s daily practice – the most difficult step in that process. This requires consideration at an early stage, but at that point many data scientists are mainly busy proving the power of the data science or AI application, rather than thinking about its future. This process starts as early as the intermediate phase between PoC and MVP (again, see this article), but it only grows in importance further down the road. Once a solution has been implemented, there is, after all, no guarantee that employees will actually use it in their day-to-day activities, or that the solution will deliver, and continue to deliver, the desired results.

Solution: maintainability
In this case, maintainability is the starting point, which requires you to look beyond the technical aspects of the application to how people will actually use it in practice. Which stakeholders should you involve to make sure your solution lands well in your organization? Clear explanation, guidance, and frequent evaluations are paramount; after all, the success of a data science or AI application stands or falls with its adoption. Application maintenance is another key aspect that is often overlooked. Who, for instance, will be responsible for ensuring that the model continues to work and to analyze the right data? Essentially, this is no different from maintaining software: it is simply necessary if you want your results to remain reliable and want to keep users from getting annoyed. Maintainability does not mean, however, that organizations should try to do everything themselves, which in practice is usually impossible for various reasons. What it does mean is that you should stay in control while working with other teams and external partners on matters that require a more specialist approach.

Conclusion: Don’t lose sight of the bigger picture
There is a clear common thread running through my plea for repeatability, reusability, and maintainability: staying in control is crucial if you want to make your data science and AI applications scalable. First of all, this means that a data-driven approach and the application of data science and AI must be embedded in your organization’s strategy. Secondly, because it is so easy to lose sight of the bigger picture in an organization with tens of thousands of employees, there should always be a team that knows exactly what is going on across all on-going data initiatives. This team is also responsible for ensuring that newly developed data applications are actually used within the organization and produce the intended results. All data science and AI expertise, skills, and models must also be safeguarded in a central place, such as a center of excellence. Setting up a center of excellence and training managers and other employees will require short-term investments, of course, but there is nothing wrong with that: they will ultimately pay off by helping you stay in control of your data-driven organization.


