Artificial Intelligence: get it from the Cloud, or Develop it Yourself?

All of the big tech companies offer specialized artificial intelligence tools. IBM has Watson, Google offers Dialogflow and Vision, Microsoft has its Cognitive Services, Amazon has Rekognition, and Facebook has Wit.ai. But what can you actually do with these tools, and when can you use them? And when is it better to develop an algorithm yourself? In this article, I will explain which factors you need to take into consideration in order to make the right choice.

By Ivo Fugers, Data Scientist at ORTEC

There are countless options for building an AI application. The open-source world offers plenty of software solutions, such as R, Python, or TensorFlow, and the open-source community is constantly expanding the collection with specialized packages that each solve a specific problem. The big tech companies also offer tools that further support the data science process, such as Azure Databricks or Google Cloud AI. Recently, the standard 'cognitive' APIs have joined the crowd: algorithms that are pre-trained for a specific purpose.

Data scientists always use the work of others. The question, however, is how far you go in using other people's specialized work, and when you should take the reins yourself. The ultimate decision depends on a large number of factors, varying from the final application and the available budget to your organization's existing IT landscape. So let's begin by looking at the 'cognitive' APIs. In general, the solutions available as an API can be divided into the following categories (a minimal code sketch of what calling such an API looks like follows the list):

  • Vision: These are algorithms that can analyze images or videos, including face recognition, object recognition, or optical character (text) recognition. Facebook uses these types of algorithms to automatically tag you in photos, for example.
  • Speech: These are algorithms that can convert text into speech and vice versa. They are a possible add-on for chatbots, for example at telephone helpdesks that first identify the subject using speech recognition before transferring the call to a human assistant, or that conduct the entire conversation with the user themselves.
  • Language: These algorithms are used in the automated comprehension of words, language, and conversations. They are essential components of chatbots, search engines, translation programs, and other applications that use natural language processing (NLP).
  • Personality: These algorithms can recognize emotion and sentiment in a conversation, or determine the user’s personality based on their word choice. Such algorithms can be used to support call center employees or personalize marketing campaigns.
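To make the idea of a pre-trained algorithm behind an API concrete, here is a minimal sketch of what calling a cloud vision service for face detection typically looks like from Python. The endpoint URL, authentication scheme, and response fields are placeholders rather than any specific vendor's API; every provider has its own SDK and response format.

```python
import requests

# Hypothetical endpoint and key: each provider has its own URL, auth scheme,
# and response format, so treat this as a sketch of the general shape only.
API_URL = "https://example-cloud.com/vision/v1/analyze"  # placeholder, not a real endpoint
API_KEY = "YOUR_API_KEY"

def detect_faces(image_path: str) -> dict:
    """Send an image to a (hypothetical) cloud vision API and return its JSON result."""
    with open(image_path, "rb") as f:
        response = requests.post(
            API_URL,
            headers={"Authorization": f"Bearer {API_KEY}"},
            files={"image": f},
            params={"features": "faces"},
            timeout=30,
        )
    response.raise_for_status()
    return response.json()

# result = detect_faces("team_photo.jpg")
# print(result.get("faces", []))
```

The appeal is clear: a few lines of glue code give you access to a model that a tech giant trained on data you will never have.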

Benefits of a standard AI tool

IBM, Amazon, Google, and Microsoft offer suites of ready-made AI tools that provide several benefits. You can see these applications as standardized AI engines that you can use for the eventual application. They are available via the cloud, and are therefore quick to deploy and easy to scale up. You also benefit from the 'AI arms race' that currently seems to be raging among the tech giants. They all want to win the battle for the user, and they devote considerable energy to development, which makes the AI systems increasingly powerful. The applications in the fields of speech, text, and facial recognition are now so effective that it no longer pays to develop them yourself. However, I have noticed that support for Dutch lags behind: language and speech processing in Dutch sometimes leave much to be desired, although there is progress.

Disadvantages of a standard AI tool

The disadvantage of AI in the cloud is that the tools are often set up in very general terms and cannot be customized. That has consequences for the flexibility of the final application. You are also dependent on your current IT landscape when choosing an API. For example, if you already have good contracts with Microsoft, then Azure Cognitive Services may be extra appealing, because the final application will integrate well with your current landscape and the services will therefore be less expensive. However, that does not necessarily mean that Azure Cognitive Services is by definition the best solution.

An API alone is not a solution

The AI algorithm in an API can do the one thing it was trained to do extremely well, but nothing else. Algorithms are often ascribed miraculous properties, but they almost always disappoint in the end. An API also has to be embedded in a complete application: a tool has to be programmed, an infrastructure has to be set up, data engineering is needed, and so on. Building a chatbot using a tech company's standard APIs takes around an hour, but a chatbot that actually replaces 10% of your customer service would take at least half a year to build. Like other so-called 'AI' solutions, the tech companies' APIs cannot think or act for themselves. AI isn't magic; it's just machine learning mixed with smart programming.
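To illustrate the point that an API has to be embedded in an application: the call to the cognitive service is usually a single line, while the surrounding plumbing (validation, error handling, a fallback to a human colleague) is where most of the work sits. A minimal sketch, in which classify_intent is a placeholder for whichever provider you use and the response fields are assumptions for this example:

```python
# Sketch: the cognitive API call is one line; the surrounding plumbing is the real work.

def classify_intent(message: str) -> dict:
    """Placeholder: replace with a call to the cognitive API of your choice."""
    raise NotImplementedError("wire up your provider here")

SCRIPTED_ANSWERS = {
    "opening_hours": "We are open Monday to Friday, 9:00-17:00.",
}

def handle_customer_message(message: str) -> str:
    if not message or not message.strip():
        return "Could you rephrase that?"
    try:
        result = classify_intent(message)
    except Exception:
        # The service is down or misconfigured: fall back to a human.
        return "Something went wrong; let me connect you to a colleague."
    intent = result.get("intent")
    confidence = result.get("confidence", 0.0)
    if intent in SCRIPTED_ANSWERS and confidence > 0.7:
        return SCRIPTED_ANSWERS[intent]
    return "I'm not sure I understood; let me transfer you to a colleague."
```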

Reasons to choose in-house development

Several factors should be taken into consideration when choosing between a standard AI tool and developing an algorithm yourself. The final application is the most important of these. The more specific the application, the more it pays to develop an algorithm yourself. An insurer that wants to classify automotive claims automatically from a photo, for example, would do well to develop its own algorithm, because there is no standard 'automotive damage algorithm'. The insurer could choose to train existing, general image recognition algorithms on labelled data, such as images of cars with and without damage, but an algorithm developed specifically for that purpose would always perform better, mainly because you can build in human 'deduction'. Another advantage is that you can then sell the application you've developed to other parties, or develop it further together with them.
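As a hedged sketch of the route described above, here is roughly what training an existing, general image recognition model on your own labelled claim photos could look like, using a pretrained Keras network as the starting point. The folder layout and the damage / no_damage classes are assumptions for this example.

```python
import tensorflow as tf

# Sketch: fine-tune a general-purpose image model on labelled claim photos.
# Assumes a folder structure like claims/damage/*.jpg and claims/no_damage/*.jpg.
train_ds = tf.keras.utils.image_dataset_from_directory(
    "claims", image_size=(224, 224), batch_size=32
)

base = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3), include_top=False, weights="imagenet"
)
base.trainable = False  # reuse the general visual features, train only the new head

model = tf.keras.Sequential([
    tf.keras.layers.Rescaling(1.0 / 127.5, offset=-1),  # scale pixels to [-1, 1]
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(1, activation="sigmoid"),  # damage vs. no damage
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(train_ds, epochs=5)
```

The point is not the specific network, but that the labelled photos and the domain knowledge behind them are the part the insurer has to supply itself.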

Building it yourself is also a logical choice if the final application is closely related to your core business. Booking.com, which develops everything itself, is an excellent example: for the past two years, fifty people have been working on its chatbot. This is a huge investment, but the application touches the core of the company's operations, so naturally it wants full control over it. However, not every company has the same budget at its disposal as Booking.com. Budget is therefore absolutely a factor in the decision-making process, but it pays to realize that developing in-house is not always more expensive than using an existing algorithm. If you expect the AI application to be used intensively, then it may be significantly cheaper in the long run to develop the algorithm yourself, because the existing tools are pay-per-use. Those costs can stack up, which makes scaling up less attractive.
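To illustrate the pay-per-use argument with some arithmetic: all numbers below are hypothetical, but they show how a break-even point between renting an API and building in-house can be estimated.

```python
# Hypothetical numbers, purely to illustrate the scaling argument.
price_per_call = 0.004           # EUR per API call (assumed)
calls_per_month = 10_000_000     # expected usage at scale (assumed)
in_house_build = 150_000         # one-off development cost (assumed)
in_house_monthly = 5_000         # hosting and maintenance per month (assumed)

api_cost_per_month = price_per_call * calls_per_month
# Only meaningful when the API bill exceeds the in-house running costs.
months_to_break_even = in_house_build / (api_cost_per_month - in_house_monthly)

print(f"Pay-per-use: EUR {api_cost_per_month:,.0f} per month")
print(f"In-house development pays off after roughly {months_to_break_even:.0f} months")
```

With low usage the same arithmetic points the other way, which is exactly why expected volume belongs in the decision.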

Conclusion

The AI tools offered by IBM, Amazon, Google, and Microsoft are advanced. If you are looking for an algorithm for speech or facial recognition, then the best option is to get one from the cloud. However, it is important to realize that the differences between these tech giants are small. For example, Google is currently best at converting images of text into text, and Amazon is number one in recognizing faces, but that could change in just a few months. My advice is therefore to test different APIs in a proof of concept before making a decision. It is also useful to organize your application in such a way that it is easy to switch to another API later. However, existing AI solutions may be too general for your specific goals, or you may expect to make intensive use of the algorithm; in those cases, in-house development is the right choice.
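One way to keep the advice about switching APIs concrete is to hide the provider behind a small interface of your own, so that only one adapter changes when you move to another vendor. A sketch; the class and method names below are this example's own, not any vendor's SDK:

```python
from typing import Protocol

class FaceDetector(Protocol):
    """The only contract the rest of the application depends on."""
    def detect_faces(self, image_bytes: bytes) -> list[dict]: ...

class ProviderAFaceDetector:
    def detect_faces(self, image_bytes: bytes) -> list[dict]:
        # Call vendor A's API here and translate its response into your own format.
        raise NotImplementedError

class ProviderBFaceDetector:
    def detect_faces(self, image_bytes: bytes) -> list[dict]:
        # Call vendor B's API here; same output format, so callers never notice the switch.
        raise NotImplementedError

def count_faces(detector: FaceDetector, image_bytes: bytes) -> int:
    return len(detector.detect_faces(image_bytes))
```

Because the rest of the application only knows FaceDetector, a proof of concept can run against both adapters and the cheapest or most accurate provider can win.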


Behind the Scenes of a ‘Self-Learning’ Algorithm: the Magic of AI Explained

Artificial Intelligence (AI) is often portrayed as a kind of magic technology that will take over humanity in a fully autonomous and self-learning manner. In reality, however, AI is mainly a combination of machine learning and smart programming, which actually requires a lot of human effort. In this article, I will provide a glimpse of what’s hidden behind the scenes of popular ‘self-learning’ applications.

By Ivo Fugers, Data Scientist at ORTEC

One of the most well-known fields of AI research is machine learning. Machine learning can perhaps best be explained as a statistical computer model that is able to recognize patterns in data. Machine learning allows AI to 'learn' from previous observations, so that it can perform tasks without being explicitly programmed to perform them (for example, a machine learning model can classify the risk of a policyholder it has never seen, assuming that person behaves in a way it has observed in previous data). Eventually, that is, because before it can do so, it needs to go through a detailed training program that requires considerable human input. A human has to accurately define the problem to be solved, outline correct and incorrect answers in advance, label the training data (although this can partly be automated), and evaluate correct and incorrect actions. In addition, a large part of the work involved in machine learning is the proper configuration of an algorithm. Each case has its own optimal settings, which demands a lot of testing and research on the part of the Data Scientist. I have selected two real-world examples to illustrate this process.
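A minimal sketch of the insurance example: a model that learns a risk pattern from previously observed, human-labelled policyholders instead of from hand-written rules. The features, numbers, and labels below are made up purely for illustration.

```python
from sklearn.linear_model import LogisticRegression

# Made-up historical observations: [age, years_driving, claims_in_last_5_years]
X = [[23, 4, 2], [45, 25, 0], [31, 10, 1], [58, 35, 0], [19, 1, 3], [40, 20, 0]]
y = [1, 0, 0, 0, 1, 0]  # 1 = turned out to be high-risk, 0 = low-risk (labelled by humans)

model = LogisticRegression().fit(X, y)

# The model has never seen this person; it only assumes they behave like earlier data.
new_applicant = [[27, 6, 1]]
print(model.predict_proba(new_applicant)[0][1])  # estimated probability of high risk
```

Nowhere is a rule like "young drivers with many claims are risky" written down; the pattern comes entirely from the labelled examples that humans prepared.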

Chatbots

Chatbots are automated conversation partners that speak or type. This form of AI is currently often used to relieve customer service departments, for example by answering frequently asked questions or by routing callers so that they reach the right person directly.

Chatbots are programmed to recognize patterns in the input they receive. Based on those patterns, they then provide a pre-scripted answer. This already requires quite a bit of human labor, both manual and intellectual. When creating a new chatbot, you have to write out 'conversation trees'. These include a wide range of input variations (e.g. 'What is the weather forecast today?', but also 'Is it going to be hot today?' or 'Is it going to rain today?'), which should all lead to the desired output (in this case, the weather forecast). It is no longer necessary to manually enter every input variation, because a good annotation model allows the most important patterns to be recognized from a number of examples. But the response scripts can quickly become very complex: the question 'Why is Rob not here, is he under the weather?' needs to activate a completely different script than the weather forecast. A chatbot therefore does not simply react to keywords, but is able to recognize the relationships between different keywords. That does not mean, however, that the chatbot knows which relationships belong to which script: defining and labelling the keywords is pure human work (or 'drudgery', as the NRC recently described it in Dutch).
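A rough sketch of this pattern: a handful of hand-written example utterances per branch of the conversation tree, each labelled with the script it should trigger, and a simple text classifier that generalizes from them to new variations. All phrases, intent names, and responses are illustrative; real chatbot platforms use far richer language models, but the division of labor is the same.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hand-written example utterances, labelled with the script they should trigger.
examples = [
    ("What is the weather forecast today?", "weather"),
    ("Is it going to be hot today?", "weather"),
    ("Is it going to rain today?", "weather"),
    ("Why is Rob not here, is he under the weather?", "absence"),
    ("Is Rob off sick today?", "absence"),
]
texts, intents = zip(*examples)

classifier = make_pipeline(TfidfVectorizer(), LogisticRegression())
classifier.fit(texts, intents)

SCRIPTS = {
    "weather": "Here is today's forecast: ...",
    "absence": "Let me check the absence calendar for you.",
}
predicted = classifier.predict(["Will it be sunny later?"])[0]
print(SCRIPTS[predicted])
```

Writing the examples, choosing the intents, and scripting the responses is exactly the human 'drudgery' the paragraph above describes.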

The next step is to make chatbots 'smarter' as they are used more often. We still need humans to define 'good' and 'bad' conversations, and to correct the algorithm as to which response to give. That way, the chatbot can 'learn' not to make the same mistake again, and the pattern it recognizes in the input is refined. This may sound like self-learning, but humans are constantly providing the necessary feedback. In addition to the Data Scientist, the end users are also frequently called on to provide that feedback. The Google Assistant is an excellent example. The chatbot recently began speaking Dutch, but it still isn't very fluent. So in order to improve its proficiency, it regularly asks its users whether it has done what they expected (for example by giving a thumbs up or a thumbs down). The more people provide it with training data, the more accurately it can predict the desired answer to the next question.
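A sketch of that feedback loop in code: a thumbs up or thumbs down becomes a new labelled example, and the 'learning' consists of retraining on the enlarged set. All names and examples are illustrative.

```python
from typing import Optional
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Initial, human-labelled examples (illustrative).
texts = ["What is the weather forecast today?", "Is Rob off sick today?"]
intents = ["weather", "absence"]
classifier = make_pipeline(TfidfVectorizer(), LogisticRegression()).fit(texts, intents)

def record_feedback(utterance: str, predicted: str, thumbs_up: bool,
                    correct_intent: Optional[str] = None) -> None:
    """Turn user feedback into a new labelled example and retrain the classifier."""
    # A thumbs-up confirms the prediction; a thumbs-down only helps once a human
    # supplies the correct label.
    label = predicted if thumbs_up else correct_intent
    if label is None:
        return
    texts.append(utterance)
    intents.append(label)
    classifier.fit(texts, intents)  # 'learning' is simply retraining on more examples
```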

Predictive maintenance

Predicting the moment that different parts of a machine need maintenance or replacement can deliver enormous improvements in efficiency. This application of machine learning, known as predictive maintenance, is extremely popular in the industrial sector. But before a machine becomes smart enough to tell its operators that it is time to inspect a pump or a bearing, many man-hours of work are required. That starts with collecting data about the variables that can affect the life cycle of machine parts. There are not only countless types of machines (turbines, pumps, centrifuges, coolers), but they all have different motors (gas, electric, diesel), drive shafts, formats, ages, and materials. Moreover, there are several different indicators of wear, varying from vibration and temperature to rotation speed or pressure. So creating a solution that collects the right data for the algorithm to use requires a great deal of human expertise. Collecting these data from different systems, cleaning them, and combining them is usually a very time-consuming process (around 80% of the Data Scientist's time, according to Forbes). And at that point, no models have been created and no insights have been generated yet!
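A sketch of that data assembly step, assuming sensor readings and maintenance logs live in two separate systems. The file names, columns, and cleaning rules are assumptions for this example; in practice this is exactly the part that consumes most of the time.

```python
import pandas as pd

# Assumed inputs from two different systems (file names and columns are illustrative):
#   sensor_readings.csv: pump_id, timestamp, vibration, temperature, pressure
#   maintenance_log.csv: pump_id, date, failure (1 = part failed or was replaced)
sensors = pd.read_csv("sensor_readings.csv", parse_dates=["timestamp"])
maintenance = pd.read_csv("maintenance_log.csv", parse_dates=["date"])

# Clean: drop physically impossible readings.
sensors = sensors[sensors["temperature"].between(-50, 200)]

# Combine: resample to hourly averages per pump...
hourly = (
    sensors.set_index("timestamp")
    .groupby("pump_id")
    .resample("1h")
    .mean(numeric_only=True)
    .reset_index()
)

# ...and attach the maintenance log so each hour gets a failure label for that day.
hourly["date"] = hourly["timestamp"].dt.floor("D")
training_table = hourly.merge(maintenance, on=["pump_id", "date"], how="left")
training_table["failure"] = training_table["failure"].fillna(0)
```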

In order to ensure that the complex predictive maintenance algorithm produces the correct results, we need a different training method than the one used for the chatbot. The chatbot 'learns' what it needs to do from complete examples elaborated by humans: when someone says 'x', they want to know 'y'. In predictive maintenance, so many variables influence the need for maintenance that the algorithm often does not even know what it should look for. It has to sort through the tangle of data to find the strongest indicators of a problem situation. In other words: we tell the algorithm that we want to know 'y', but we have only a vague idea of what 'x' is, except that it is hidden in the data. When the algorithm eventually tells us what we want to know, it may seem as if it is 'self-learning', but before it can do so, humans again have to teach it which values belong to a machine that is operating 'correctly' or 'incorrectly'. In other words: the algorithm can only start searching for the 'x' after a human tells it what the normal and problem situations are.
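A sketch of 'letting the algorithm search for x': once humans have labelled which periods were normal and which were problem situations, a classifier can be trained on the sensor readings and then asked which indicators carried the signal. The data below are synthetic and purely illustrative.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Synthetic stand-in for the labelled sensor table: humans have marked which
# hours belonged to a 'correct' (0) or 'problem' (1) situation.
n = 1000
vibration = rng.normal(1.0, 0.2, n)
temperature = rng.normal(60, 5, n)
pressure = rng.normal(4.0, 0.3, n)
# In this toy example only vibration actually signals trouble.
failure = (vibration + rng.normal(0, 0.1, n) > 1.3).astype(int)

X = np.column_stack([vibration, temperature, pressure])
model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, failure)

# The model reports which indicators carried the signal: the 'x' it found.
for name, importance in zip(["vibration", "temperature", "pressure"],
                            model.feature_importances_):
    print(f"{name}: {importance:.2f}")
```

The labels that make this possible come from humans; the model only decides how much weight each indicator deserves.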

Greater value from machine learning

AI applications such as chatbots and predictive maintenance have huge potential and can be a valuable tool for increasing efficiency. But before your company starts pursuing AI applications, it is important to understand that they involve more than pressing a magic button. The process of producing a prediction from data has to be arranged extremely precisely. That requires expanding your knowledge, perhaps even changing your operations, and it will certainly involve a lot of human effort. Once you understand that, you start the project with the right expectations, which gives you a much greater chance of creating a machine learning application that is fully attuned to your operations and that delivers the promised added value.
