Quick-MIMIC: A Multimodal Data Extraction Pipeline for MIMIC with Parallelization

Authors :
Yutao Dou
Wei Li
Yangtao Zheng
Xiaojun Yao
Huanxiang Liu
Albert Y. Zomaya
Shaoliang Peng
Source :
Big Data Mining and Analytics, Vol 7, Iss 4, Pp 1333-1346 (2024)
Publication Year :
2024
Publisher :
Tsinghua University Press, 2024.

Abstract

Medical big data combined with artificial intelligence is vital to advancing digital medicine. However, most medical data extraction is opaque and non-standardised, which makes it prone to batch effects and has become a significant obstacle to reproducing previous work. This paper develops an easy-to-use time-series multimodal data extraction pipeline, Quick-MIMIC, for standardised data extraction from MIMIC datasets. Our method can fully integrate different data structures, including structured, semi-structured, and unstructured data, into a time-series table. We also introduce two additional modules to Quick-MIMIC: a pipeline parallelization method, which reduces data extraction time, and data analysis methods, which present the characteristics of the extracted data intuitively. Extensive experimental results show that our pipeline can efficiently extract the needed data from the MIMIC dataset and convert it into the correct format for further analytic tasks.

Details

Language :
English
ISSN :
2096-0654
Volume :
7
Issue :
4
Database :
Directory of Open Access Journals
Journal :
Big Data Mining and Analytics
Publication Type :
Academic Journal
Accession number :
edsdoj.7dfb9423916f418c8347a5a435ec3bfe
Document Type :
article
Full Text :
https://doi.org/10.26599/BDMA.2024.9020024