
According to Gartner’s Hype Cycle 2016, public interest in machine learning has reached its peak. More generally, “artificial intelligence” is becoming increasingly important in the development of new technologies across a wide variety of applications. As a result, businesses, and data scientists in particular (a newly created occupational profile), must rise to the challenges of this new subject.
In light of the above, the event “Artificial intelligence and the future of data scientists” at thaltegos GmbH was devoted to answering questions such as:
- What does the term artificial intelligence mean?
- Which technologies are part of the artificial intelligence field and which approaches result from it within the value-added chain?
- What are the challenges and requirements that businesses and in particular data scientists must face in this context?
To answer these questions, the terminology used within the subject of artificial intelligence (A.I.) must first be clarified. A.I. is the umbrella term and has to be distinguished from the areas of machine learning and deep learning. While A.I. in general stands for the mechanical generation of knowledge out of experience, machine learning and deep learning are the technologies currently used to generate this knowledge. In machine learning, sample data is used to let machines recognize patterns and regularities that can then be applied to unknown data. Deep learning is a subfield of machine learning that uses multi-layered artificial neural networks, which are loosely modeled on the human brain.
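The machine learning idea described above (learn patterns from sample data, then apply them to unknown data) can be illustrated with a minimal sketch. It uses a simple nearest-centroid classifier, which is only one of many possible techniques and was not presented at the event; the sample values, labels, and function names are hypothetical.

```python
from statistics import mean

def fit_centroids(samples, labels):
    """Learn one centroid (mean point) per label from labeled sample data."""
    grouped = {}
    for point, label in zip(samples, labels):
        grouped.setdefault(label, []).append(point)
    return {
        label: tuple(mean(axis) for axis in zip(*points))
        for label, points in grouped.items()
    }

def predict(centroids, point):
    """Assign an unseen point to the label whose centroid is closest."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: squared_distance(centroids[label], point))

# Hypothetical sample data: two numeric measurements per item.
samples = [(1.0, 1.2), (0.8, 1.0), (4.0, 4.2), (4.2, 3.9)]
labels = ["small", "small", "large", "large"]

centroids = fit_centroids(samples, labels)   # "training" on sample data
print(predict(centroids, (0.9, 1.1)))        # applying the pattern to unknown data
print(predict(centroids, (3.8, 4.0)))
```

The "knowledge" here is nothing more than the two learned centroids; real tools use far richer models, but the split into a learning step and an application step is the same.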
Over the past few years, a multitude of A.I. tools have been developed, encompassing everything from cloud-based to on-premises solutions: from the software solutions of the big players (for example IBM Watson, AWS Machine Learning), to programming languages with A.I. packages (for example R, Python), and on to open-source code libraries (for example TensorFlow, H2O.ai). In addition, there are special tools for specific subject areas. The current top providers, Microsoft, Amazon and IBM with Watson, are at the core of this market.
These tools work with both unstructured and structured data. In particular, A.I. makes it possible to transform unstructured data in the form of speech, text or pictures into structured data (tabular form), or to work directly with the unstructured data. This means that A.I. can play a role at any point in the data value-added chain. Fundamental data management can be supported, for example, by automated data enrichment and completion. On this basis, A.I. makes it possible to apply advanced analysis methods such as text mining, anomaly detection, and many more. This ultimately creates the foundation for A.I. tools in the fields of business intelligence and business optimization. A.I.-based technologies can therefore provide support along the entire value-added chain, for example in predictive maintenance in production or in the field of personalized and automated marketing.
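As a hedged illustration of turning unstructured text into tabular form, the following sketch builds a simple word-count table (a "bag of words"), a common first step in text mining. The function name and the example texts are hypothetical, and production tools use considerably more sophisticated processing.

```python
import re
from collections import Counter

def texts_to_table(texts):
    """Turn free-text documents into a word-count table (one row per text)."""
    counts = [Counter(re.findall(r"[a-z]+", text.lower())) for text in texts]
    vocabulary = sorted(set().union(*counts))          # column headers
    rows = [[c[word] for word in vocabulary] for c in counts]
    return vocabulary, rows

# Hypothetical maintenance notes as unstructured input.
texts = ["Pump failure reported", "Pump running, no failure"]
vocabulary, rows = texts_to_table(texts)

print(vocabulary)
for row in rows:
    print(row)
```

Each row now has the same columns, so the texts can be stored in a table and passed on to standard analysis methods.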
To use A.I. technologies successfully, companies need the right foundations in addition to an internal interdisciplinary project team. This means that the right expertise and the right IT infrastructure must be available. A manager with good business understanding or an IT expert can ensure this. An open corporate culture and a responsible approach to data security are also necessary. The data scientist acts as the link between management and IT. On this basis, the data scientist can work with A.I. tools efficiently and present the results in an understandable manner.
The data scientist’s spectrum of activities is diverse. To meet these demands, it is necessary to be knowledgeable in a variety of fields. First, it is important to have a general understanding of business in order to be able to specify reasonable goals and consequently determine the necessary procedures. Furthermore, a data scientist must be knowledgeable in mathematics in order to create statistical models and solve optimization problems. Skills in the field of computer science are necessary to implement these models. This particularly includes a command of programming languages, the use of statistical software, and skills in the BI tools needed for reporting.
As A.I. evolves, data scientists will no longer have to implement the various models themselves. A.I. tools will automatically and intelligently set up and implement the best model, thereby taking over the basic modelling and calculations. With this development, knowing how to use the relevant tools will be sufficient to make good use of them. The data scientist can thus fall back on the best possible models and focus on the value-adding interpretation of the results. Furthermore, communicating and visualizing the obtained results in an understandable way will likely be an important part of the work. In the future it will therefore be the data scientist’s job to recognize the possibilities of A.I. and use them profitably.