1. Measuring the time spent on data curation.
- Author
Perry, Anja and Netscher, Sebastian
- Subjects
DATA curation, DATA libraries, WORKING hours, DATA management, DATA scrubbing, DIGITAL preservation
- Abstract
Purpose: Budgeting data curation tasks in research projects is difficult. In this paper, we investigate the time spent on data curation, more specifically on cleaning and documenting quantitative data for data sharing. We develop recommendations on cost factors in research data management. Design/methodology/approach: We make use of a pilot study conducted at the GESIS Data Archive for the Social Sciences in Germany between December 2016 and September 2017. During this period, data curators at GESIS - Leibniz Institute for the Social Sciences documented their working hours while cleaning and documenting data from ten quantitative survey studies. We analyse the recorded times and discuss them with the data curators involved in this work to identify and examine important cost factors in data curation, that is, aspects that increase the hours spent and factors that reduce their work. Findings: We identify two major drivers of time spent on data curation: the size of the data and the personal information contained in the data. Learning effects can occur when data are similar, that is, when they contain the same variables. Important interdependencies exist between individual tasks in data curation and in connection with certain data characteristics. Originality/value: The different tasks of data curation, the time spent on them and the interdependencies between individual steps in curation have so far not been analysed. [ABSTRACT FROM AUTHOR]
- Published
2022