1. Exploiting Curated, Domain-Specific Repositories to Facilitate Globally Interoperable Databases: the GEOROC Use-Case for Global Geochemical Data
- Author
- Marthe Klöcking, Adrian Sturm, Bärbel Sarbas, Leander Kallas, Stefan Möller-McNett, Jens Nieschulze, Kerstin Lehnert, Kirsten Elger, Wolfram Horstmann, Daniel Kurzawe, Matthias Willbold, and Gerhard Wörner
- Abstract
The GEOROC database is a leading open-access source of geochemical and isotopic datasets of igneous and metamorphic rocks and minerals. It was established 24 years ago and currently provides access to curated compilations of rock and mineral compositions from >20,600 publications (>32 million single data values). The Digital Geochemical Data Infrastructure (DIGIS) initiative for GEOROC 2.0 is now building a connected platform capable of supporting the diverse demands of digital, data-based geochemical research, including modern solutions for data submission, discovery and access.
One of the challenges in maintaining a high-quality, up-to-date database such as GEOROC is consistent data entry. Historically, data were compiled manually from the academic literature by trained curators. This manual data entry process is slow, resource-intensive and prone to errors. The quality and completeness of data and metadata compiled in this way are highly variable, a problem exacerbated by the lack of best practices or standards for analytical geochemical data reporting. Domain-specific repositories offer a possible solution to this challenge: driven in part by the demands of some funders and publishers to make all research data publicly available, data producers increasingly publish their research datasets, which gives repositories a unique opportunity to impose consistent standards and quality. Following these developments, DIGIS established a domain repository with DOI minting capabilities in 2021 to support independent data submission by authors. In principle, these data submissions may comprise new analytical results as well as compilations of previously published data (“expert datasets”).
DIGIS also uses its repository for versioning of the GEOROC data compilations and to provide distinct, citable objects to researchers who use GEOROC compilations in their work: the so-called “precompiled files”, a collection of pre-formatted results of the most popular search queries to the GEOROC database, are regularly updated and re-published. However, whilst all data submissions by authors are required to fall within the scope of the GEOROC database, new analytical data need to meet additional quality requirements: the repository enforces a strict template to ensure consistent reporting of all relevant sample and method/analysis metadata. These templates can then be automatically harvested from the repository directly into the GEOROC database, with the added guarantee that new data entries a) are approved by the owners of the datasets, and b) follow a consistent data reporting and quality standard.
To encourage user uptake of both the repository and the compilations available in the GEOROC database, DIGIS is working closely with IEDA2 and EarthChem towards developing a common infrastructure for geochemical data. One goal of this collaboration is a single repository submission platform that applies the same requirements for data and metadata quality to all submitted datasets. In addition, DIGIS has also partnered with GFZ Data Services as their trusted domain repository. Finally, through the OneGeochemistry initiative, all three partners are working towards global, community-endorsed best practices for geochemical data publication. Ultimately, these efforts will facilitate greater interoperability between globally distributed geochemical data systems, enabling more user-friendly delivery of data publication and compilation services to the research community.
- Published
- 2023