Measuring the time spent on data curation.

Authors :
Perry, Anja
Netscher, Sebastian
Source :
Journal of Documentation; 2022, Vol. 78 Issue 7, p282-304, 23p
Publication Year :
2022

Abstract

Purpose: Budgeting data curation tasks in research projects is difficult. In this paper, we investigate the time spent on data curation, more specifically on cleaning and documenting quantitative data for data sharing. We develop recommendations on cost factors in research data management.

Design/methodology/approach: We make use of a pilot study conducted at the GESIS Data Archive for the Social Sciences in Germany between December 2016 and September 2017. During this period, data curators at GESIS - Leibniz Institute for the Social Sciences documented their working hours while cleaning and documenting data from ten quantitative survey studies. We analyse the recorded times and discuss them with the data curators involved in this work to identify and examine important cost factors in data curation, that is, aspects that increase the hours spent and factors that lead to a reduction of their work.

Findings: We identify two major drivers of time spent on data curation: the size of the data and the personal information contained in the data. Learning effects can occur when data are similar, that is, when they contain the same variables. Important interdependencies exist between individual tasks in data curation and in connection with certain data characteristics.

Originality/value: The different tasks of data curation, the time spent on them and the interdependencies between individual steps in curation have so far not been analysed. [ABSTRACT FROM AUTHOR]

Details

Language :
English
ISSN :
0022-0418
Volume :
78
Issue :
7
Database :
Complementary Index
Journal :
Journal of Documentation
Publication Type :
Academic Journal
Accession number :
160709414
Full Text :
https://doi.org/10.1108/JD-08-2021-0167