
Comparing the Performance of NoSQL Approaches for Managing Archetype-Based Electronic Health Record Data

Authors :
Freire, Sergio Miranda
Teodoro, Douglas
Wei-Kleiner, Fang
Sundvall, Erik
Karlsson, Daniel
Lambrix, Patrick
Publication Year :
2016

Abstract

This study provides an experimental performance evaluation on population-based queries of NoSQL databases storing archetype-based Electronic Health Record (EHR) data. There are few published studies regarding the performance of persistence mechanisms for systems that use multilevel modelling approaches, especially when the focus is on population-based queries. A healthcare dataset with 4.2 million records stored in a relational database (MySQL) was used to generate XML and JSON documents based on the openEHR reference model. Six datasets with different sizes were created from these documents and imported into three single machine XML databases (BaseX, eXistdb and Berkeley DB XML) and into a distributed NoSQL database system based on the MapReduce approach, Couchbase, deployed in different cluster configurations of 1, 2, 4, 8 and 12 machines. Population-based queries were submitted to those databases and to the original relational database. Database size and query response times are presented. The XML databases were considerably slower and required much more space than Couchbase. Overall, Couchbase had better response times than MySQL, especially for larger datasets. However, Couchbase requires indexing for each differently formulated query and the indexing time increases with the size of the datasets. The performances of the clusters with 2, 4, 8 and 12 nodes were not better than the single node cluster in relation to the query response time, but the indexing time was reduced proportionally to the number of nodes. The tested XML databases had acceptable performance for openEHR-based data in some querying use cases and small datasets, but were generally much slower than Couchbase. Couchbase also outperformed the response times of the relational database, but required more disk space and had a much longer indexing time. 
Systems like Couchbase are thus interesting research targets for scalable storage and querying of archetype-based EHR data when population-based use c…

Funding agencies: Coordenacao de Aperfeicoamento de Pessoal de Nivel Superior (CAPES Foundation - Brazil) [4055/11]; Conselho Brasileiro de Desenvolvimento Cientifico e Tecnologico (CNPq) [150916/2013-2]

Details

Database :
OAIster
Notes :
application/pdf, English
Publication Type :
Electronic Resource
Accession number :
edsoai.on1234097661
Document Type :
Electronic Resource
Full Text :
https://doi.org/10.1371/journal.pone.0150069