Prof. Dr. Rudi Studer: Using data efficiently – a challenge for the future


Just as they depend on raw materials such as iron ore, water and corn, many modern companies depend on data – data from the logistics chain, on customer behaviour, on the condition of machinery or on the geography of the regions in which they operate. Two things are decisive for the success of such companies: low-cost access to the raw material data, and its exploitation, that is, distilling information from the data and using it in products and services. This article uses the “raw materials microscope” to shed new light on the role of data in organizations, to describe challenges and opportunities, and to identify how German companies can obtain and use this raw material more effectively in future.

Data as raw material. As an example, let us imagine that a retailer fits the shop shelves with sensors that record the proximity and identity of customers’ mobile telephones. The data recorded by the sensors are the raw material with which the retailer wishes to improve his or her business. As an example of a traditional raw material, let us imagine a tonne of uranium ore. At first glance the difference between these two raw materials could not be greater – and yet, at second glance, there are both interesting similarities and illustrative differences.

Step one: collection. The similarities begin when the raw material is obtained. In the case of uranium ore, mining is preceded by the identification of deposits, which requires specialized equipment. When raw data is obtained, the identification of potentially valuable data is likewise the first priority. In the above example of the retailer, the realization that the signals from customers’ mobile telephones supply valuable data on the use of retail space is a creative achievement in itself. The next step is to construct an infrastructure for obtaining the raw material “data”; in this example, this is a physical infrastructure of sensors that need to be installed. In other cases the infrastructure may consist of software programs that bring together data from the relevant databases.

In this step, however, a key difference between traditional raw materials and data becomes apparent: while traditional raw materials are largely finite, data can be reproduced at will.

This potentially infinite availability of data should not hide the fact that control over databases may constitute a position of considerable power, particularly when data are very expensive to generate or cannot easily be duplicated. For example, the datasets on social relationships administered by Facebook and similar digital social networks constitute a major competitive advantage that other providers cannot easily duplicate.

Step two: refining. In the next step – in the case of uranium ore – the actual uranium must be extracted from the ore and brought into a usable form. In the case of data, the challenge is the same: the data from the sensors must be interpreted and related to other data, such as the retailer’s plan of the shop and information on the placement of products. Only after this step has been taken can the data be used to answer questions such as “How many customers leave this shop without buying anything?” or “How often does a typical customer come into my shop?”.
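The refining step described above can be illustrated with a minimal sketch in Python. Everything in it is an assumption made for illustration and not part of the retailer example itself: the device identifiers, the shelf names, the shop plan, the rule that one device on one day counts as one visit, and the idea that purchases can be linked to a device at all.

```python
from collections import defaultdict
from datetime import date

# Hypothetical raw sensor readings: (device_id, shelf_id, day of the reading).
READINGS = [
    ("dev-a", "shelf-1", date(2024, 1, 2)),
    ("dev-a", "shelf-2", date(2024, 1, 2)),
    ("dev-a", "shelf-1", date(2024, 1, 9)),
    ("dev-b", "shelf-3", date(2024, 1, 2)),
    ("dev-c", "shelf-2", date(2024, 1, 3)),
]

# Hypothetical shop plan: which product group is placed on which shelf.
SHOP_PLAN = {"shelf-1": "dairy", "shelf-2": "bakery", "shelf-3": "drinks"}

# Hypothetical till data: (device_id, day) pairs on which a purchase was made.
PURCHASES = {("dev-a", date(2024, 1, 2)), ("dev-c", date(2024, 1, 3))}


def visits(readings, shop_plan):
    """Relate readings to the shop plan; one visit per device per day."""
    grouped = defaultdict(set)
    for device, shelf, day in readings:
        grouped[(device, day)].add(shop_plan[shelf])
    return dict(grouped)


def visits_without_purchase(readings, shop_plan, purchases):
    """Count visits for which no purchase was recorded."""
    return sum(1 for visit in visits(readings, shop_plan) if visit not in purchases)


def average_visits_per_customer(readings, shop_plan):
    """Average number of visits per distinct device."""
    per_device = defaultdict(int)
    for device, _day in visits(readings, shop_plan):
        per_device[device] += 1
    return sum(per_device.values()) / len(per_device)


if __name__ == "__main__":
    print(visits_without_purchase(READINGS, SHOP_PLAN, PURCHASES))
    print(average_visits_per_customer(READINGS, SHOP_PLAN))
```

The raw readings alone answer neither question; only once they are grouped into visits and related to the shop plan and the till data do they become usable information.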

A feature common to both types of raw material is that they have a wide range of uses – energy generation and weapons-making in the case of uranium; analyzing the interior design of shops, developing buying-behaviour models or optimizing in-shop advertisements in real time in the case of data. In many cases, the possible uses cannot be foreseen at the moment of extraction, and preparing data in a form that can serve a wide range of applications is a challenge faced by many companies today.

A third common point, and in this example a very clear one, is security and control. Both uranium and information on customer movements in shops can be misused. Preventing this is a challenge that shapes the entire way in which the raw material is collected and used.

Step three: use. In a third step, both types of raw material must be processed in order to make them into profitable products and services. In the case of uranium, this is primarily energy generation. In the case of data, this could be optimizing the operations of a shop with reference to the route shoppers usually take in the shop, or targeted advertising based on the changes customers make to their routes through a shop in the course of a day.

Challenges and opportunities. Looking at the entire process from collection to use, it is noticeable that, in the case of traditional raw materials, it only rarely takes place within a single company. With the raw material data, the opposite is true: all the above steps usually take place in a single company. In future this process must change so that data can be traded and used efficiently across organizational borders. This challenge is technical as well as organizational and political. On the technical side, the core question is how to exchange data securely and in a form in which they retain their meaning in a new context. Here the semantic web techniques developed at the Computing Science Research Centre (FZI) offer an answer. On the organizational and political side, legislation and regulations are needed to provide a suitable framework for trade in data that is reliable for all parties involved.

The author is a professor at the Institute for Applied Computing Science and Formal Description (AIFB) and a director at the Karlsruhe Service Research Institute (KSRI) at the Karlsruhe Institute of Technology (KIT). He is also a member of the executive board of the Computing Science Research Centre (FZI) and a founding partner of ontoprise GmbH.