1. Data administration shell for data-science-driven development
- Author
- Michael Weyrich, Nasser Jazdi, Andreas Löcklin, Hannes Vietz, Tamás Ruppert, and Dustin White
- Subjects
Information management, Computer science, Information sharing, Shell (computing), Reuse, Data science, Field (computer science), Documentation, Scripting language, General Earth and Planetary Sciences, General Environmental Science, Data administration
- Abstract
Data-science-driven development projects are attracting growing attention from small and medium-sized enterprises (SMEs). Since SMEs often lack the necessary data science competencies, cooperation with other companies or universities is required. Efficient data handling is one of the main challenges in joint cross-enterprise development projects. The actual cost driver is the preparation of data through labeling and classification by domain experts, which is very time-consuming and labor-intensive for large amounts of data. Clearance processes can also cause considerable delays before data can be shared with project partners. Moreover, before the actual work can begin, incomplete or noisy data often has to be cleaned up and repaired. The Data Administration Shell concept presented in this paper addresses the challenge of structured information sharing and information management in joint cross-enterprise engineering. The Data Administration Shell links data sets to information about their origin and about analyses already performed, including their results and program scripts. Adding relations and documentation facilitates the reuse of data sets in subsequent projects. For this purpose, the Data Administration Shell adapts concepts used for information sharing in the research fields of manufacturing and Digital Twins. The Data Administration Shell was evaluated on time-series measurement data from a production process optimization scenario, where it manages the time-series data sets and facilitates the joint cross-enterprise engineering of data-driven solutions.
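The linking idea in the abstract (a data set tied to its origin, prior analyses, their results and scripts, plus relations and documentation) can be pictured as a small record type. The following is only an illustrative sketch, not the authors' implementation: the class names, field names, file paths, and sample values are all hypothetical and do not come from the paper.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Analysis:
    """One analysis already performed on a data set (hypothetical structure)."""
    description: str
    script_path: str      # program script that produced the result
    result_summary: str   # short note on what the analysis found


@dataclass
class DataAdministrationShell:
    """Hypothetical record linking a data set to its origin and prior analyses."""
    dataset_uri: str
    origin: str
    documentation: str = ""
    analyses: List[Analysis] = field(default_factory=list)
    related_datasets: List[str] = field(default_factory=list)

    def add_analysis(self, analysis: Analysis) -> None:
        """Attach an analysis so that later projects can find and reuse it."""
        self.analyses.append(analysis)

    def scripts(self) -> List[str]:
        """Return the scripts of all recorded analyses, in insertion order."""
        return [a.script_path for a in self.analyses]


# Illustrative usage with made-up values.
shell = DataAdministrationShell(
    dataset_uri="data/process_timeseries.csv",
    origin="measurement log from a production process (illustrative)",
)
shell.add_analysis(
    Analysis("outlier removal", "scripts/clean.py", "noisy samples dropped")
)
print(shell.scripts())
```

Keeping scripts and result summaries next to the data set reference is what would let a project partner see what has already been done before repeating the cleaning or labeling effort.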
- Published
- 2021