Challenges and best practices for digital unstructured data enrichment in health research: A systematic narrative review.

Authors :
Jana Sedlakova
Paola Daniore
Andrea Horn Wintsch
Markus Wolf
Mina Stanikic
Christina Haag
Chloé Sieber
Gerold Schneider
Kaspar Staub
Dominik Alois Ettlin
Oliver Grübner
Fabio Rinaldi
Viktor von Wyl
University of Zurich Digital Society Initiative (UZH-DSI) Health Community
Source :
PLOS Digital Health, Vol 2, Iss 10, p e0000347 (2023)
Publication Year :
2023
Publisher :
Public Library of Science (PLoS), 2023.

Abstract

Digital data play an increasingly important role in advancing health research and care. However, most digital data in healthcare are unstructured and often not readily accessible for research. Unstructured data often lack standardization and need significant preprocessing and feature extraction efforts. This poses challenges when combining such data with other data sources to enhance the existing knowledge base, which we refer to as digital unstructured data enrichment. Overcoming these methodological challenges requires significant resources and may limit the ability to fully leverage the potential of such data for advancing health research and, ultimately, prevention and patient care delivery. While prevalent challenges associated with unstructured data use in health research are widely reported across the literature, a comprehensive interdisciplinary summary of such challenges and possible solutions to facilitate their use in combination with structured data sources is missing. In this study, we report findings from a systematic narrative review on the seven most prevalent challenge areas connected with digital unstructured data enrichment in the fields of cardiology, neurology and mental health, along with possible solutions to address these challenges. Based on these findings, we developed a checklist that follows the standard data flow in health research studies. This checklist aims to provide initial systematic guidance to inform early planning and feasibility assessments for health research studies aiming to combine unstructured data with existing data sources. Overall, the generality of reported unstructured data enrichment methods in the studies included in this review calls for more systematic reporting of such methods to achieve greater reproducibility in future studies.

Details

Language :
English
ISSN :
2767-3170
Volume :
2
Issue :
10
Database :
Directory of Open Access Journals
Journal :
PLOS Digital Health
Publication Type :
Academic Journal
Accession number :
edsdoj.0ec8982340aba36a4ca7dbf17b49
Document Type :
article
Full Text :
https://doi.org/10.1371/journal.pdig.0000347&type=printable