Development of a Semiautomated Database for Patients With Adult Congenital Heart Disease.

Authors :
Verma S
Alkan M
Deligianni F
Anagnostopoulos C
Diller G
Walker L
Johnston FC
Danton M
Walker H
Swan L
Hunter A
McGuire A
Dawes M
Stott S
Lyndsey M
Walker N
Veldtman G
Source :
The Canadian journal of cardiology [Can J Cardiol] 2022 Oct; Vol. 38 (10), pp. 1634-1640. Date of Electronic Publication: 2022 May 31.
Publication Year :
2022

Abstract

Background: Databases for Congenital Heart Disease (CHD) are effective in delivering accessible datasets ready for statistical inference. Data collection hitherto has, however, been labour and time intensive and has required substantial financial support to ensure sustainability. We propose here creation and piloting of a semiautomated technique for data extraction from clinic letters to populate a clinical database.

Methods: PDF formatted clinic letters stored in a local folder, through a series of algorithms, underwent data extraction, preprocessing, and analysis. Specific patient information (diagnoses, diagnostic complexity, interventions, arrhythmia, medications, and demographic data) was processed into text files and structured data tables, used to populate a database. A specific data validation schema was predefined to verify and accommodate the information populating the database. Unsupervised learning in the form of a dimensionality reduction technique was used to project data into 2 dimensions and visualize their intrinsic structure in relation to the diagnosis, medication, intervention, and European Society of Cardiology classification lists of disease complexity. Ninety-three randomly selected letters were reviewed manually for accuracy.

Results: There were 1409 consecutive outpatient clinic letters used to populate the Scottish Adult Congenital Cardiac Database. Mean patient age was 35.4 years; 47.6% female; with 698 (49.5%) having moderately complex, 369 (26.1%) greatly complex, and 284 (20.1%) mildly complex lesions. Individual diagnoses were successfully extracted in 96.95%, and demographic data were extracted in 100% of letters. Data extraction, database upload, data analysis and visualization took 571 seconds (9.51 minutes). Manual data extraction in the categories of diagnoses, intervention, and medications yielded accuracy of the computer algorithm in 94%, 93%, and 93%, respectively.

Conclusions: Semiautomated data extraction from clinic letters into a database can be achieved successfully with a high degree of accuracy and efficiency.

(Copyright © 2022 Canadian Cardiovascular Society. Published by Elsevier Inc. All rights reserved.)
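The Methods describe extracting patient information from letter text and checking it against a predefined validation schema. The abstract does not give the algorithms themselves, so the following is only a minimal sketch of that general idea, not the authors' implementation. It assumes the PDF text has already been converted to plain text, and every name in it (`LetterRecord`, `find_terms`, `extract_record`, the `Age:`/`Sex:` field layout, the term lists, and the age bounds) is hypothetical.

```python
import re
from dataclasses import dataclass, field

# Hypothetical vocabularies; the study's actual term lists are not given in the abstract.
DIAGNOSIS_TERMS = ("tetralogy of fallot", "atrial septal defect", "coarctation of the aorta")
MEDICATION_TERMS = ("bisoprolol", "warfarin", "ramipril")


@dataclass
class LetterRecord:
    """Structured row built from one clinic letter (assumed schema)."""
    age: int
    sex: str
    diagnoses: list = field(default_factory=list)
    medications: list = field(default_factory=list)

    def validate(self):
        """Reject records that violate the assumed schema constraints."""
        if not 0 <= self.age <= 120:
            raise ValueError(f"implausible age: {self.age}")
        if self.sex not in {"female", "male"}:
            raise ValueError(f"unexpected sex: {self.sex!r}")
        return self


def find_terms(text, vocabulary):
    """Return the vocabulary terms that occur in the text, in vocabulary order."""
    lowered = text.lower()
    return [
        term for term in vocabulary
        if re.search(r"\b" + re.escape(term) + r"\b", lowered)
    ]


def extract_record(letter_text):
    """Parse one letter's plain text into a validated record."""
    # Assumed field layout: "Age: <number>" and "Sex: <female|male>".
    age_match = re.search(r"\bAge:\s*(\d{1,3})\b", letter_text, flags=re.IGNORECASE)
    sex_match = re.search(r"\bSex:\s*(female|male)\b", letter_text, flags=re.IGNORECASE)
    if age_match is None or sex_match is None:
        raise ValueError("demographic fields not found")
    record = LetterRecord(
        age=int(age_match.group(1)),
        sex=sex_match.group(1).lower(),
        diagnoses=find_terms(letter_text, DIAGNOSIS_TERMS),
        medications=find_terms(letter_text, MEDICATION_TERMS),
    )
    return record.validate()


# Invented sample text, for illustration only.
sample = (
    "Age: 35 Sex: Female\n"
    "Diagnosis: repaired Tetralogy of Fallot.\n"
    "Medication: bisoprolol 2.5 mg."
)
record = extract_record(sample)
print(record)
```

Validating each record before it reaches the database mirrors the role the abstract assigns to the schema: a letter that fails a check is reported rather than silently loaded.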

Details

Language :
English
ISSN :
1916-7075
Volume :
38
Issue :
10
Database :
MEDLINE
Journal :
The Canadian journal of cardiology
Publication Type :
Academic Journal
Accession number :
35661703
Full Text :
https://doi.org/10.1016/j.cjca.2022.05.022