Sex-Based Performance Disparities in Machine Learning Algorithms for Cardiac Disease Prediction: Exploratory Study
- Source :
- Journal of Medical Internet Research, Vol 26, p e46936 (2024)
- Publication Year :
- 2024
- Publisher :
- JMIR Publications, 2024.
Abstract
- Background: The presence of bias in artificial intelligence has garnered increased attention, with inequities in algorithmic performance being exposed across the fields of criminal justice, education, and welfare services. In health care, the inequitable performance of algorithms across demographic groups may widen health inequalities.
- Objective: Here, we identify and characterize bias in cardiology algorithms, looking specifically at algorithms used in the management of heart failure.
- Methods: Stage 1 involved a literature search of PubMed and Web of Science for key terms relating to cardiac machine learning (ML) algorithms. Papers that built ML models to predict cardiac disease were evaluated for their focus on demographic bias in model performance, and open-source data sets were retained for our investigation. Two open-source data sets were identified: (1) the University of California Irvine Heart Failure data set and (2) the University of California Irvine Coronary Artery Disease data set. We reproduced existing algorithms that have been reported for these data sets, tested them for sex biases in algorithm performance, and assessed a range of remediation techniques for their efficacy in reducing inequities. Particular attention was paid to the false negative rate (FNR), due to the clinical significance of underdiagnosis and missed opportunities for treatment.
- Results: In stage 1, our literature search returned 127 papers, with 60 meeting the criteria for a full review and only 3 papers highlighting sex differences in algorithm performance. In the papers that reported sex, there was a consistent underrepresentation of female patients in the data sets. No papers investigated racial or ethnic differences. In stage 2, we reproduced algorithms reported in the literature, achieving mean accuracies of 84.24% (SD 3.51%) for data set 1 and 85.72% (SD 1.75%) for data set 2 (random forest models). For data set 1, the FNR was significantly higher for female patients in 13 out of 16 experiments, meeting the threshold of statistical significance (–17.81% to –3.37%; P
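The abstract's central metric, the false negative rate compared between sexes, can be illustrated with a minimal sketch. This is not the authors' code: the function names, the toy labels, and the "M"/"F" group codes are all hypothetical, and the sign convention (male FNR minus female FNR, so that a negative gap means more missed cases among female patients) is an assumption that matches the direction of the range quoted above but is not confirmed by the abstract.

```python
from collections import defaultdict


def false_negative_rate(y_true, y_pred):
    """FNR = FN / (FN + TP), computed over the actual positive cases only."""
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    positives = fn + tp
    # With no positive cases the rate is undefined, so return NaN.
    return fn / positives if positives else float("nan")


def fnr_by_group(y_true, y_pred, groups):
    """Split labels and predictions by group, then compute FNR per group."""
    buckets = defaultdict(lambda: ([], []))
    for t, p, g in zip(y_true, y_pred, groups):
        buckets[g][0].append(t)
        buckets[g][1].append(p)
    return {g: false_negative_rate(ts, ps) for g, (ts, ps) in buckets.items()}


# Made-up labels for illustration only (1 = disease present).
y_true = [1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 1, 1, 0, 0, 1, 0, 0]
sex = ["M"] * 6 + ["F"] * 6

rates = fnr_by_group(y_true, y_pred, sex)
# Assumed convention: male minus female; negative means higher FNR for females.
gap = rates["M"] - rates["F"]
print(rates, gap)
```

In this toy data the model misses 1 of 4 positive male cases and 2 of 4 positive female cases, so the gap is negative. Judging whether such a gap is statistically significant, as the study does, would require a separate test across repeated experiments.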
Details
- Language :
- English
- ISSN :
- 14388871
- Volume :
- 26
- Database :
- Directory of Open Access Journals
- Journal :
- Journal of Medical Internet Research
- Publication Type :
- Academic Journal
- Accession number :
- edsdoj.98c10b04643f1be14ae7b5da43bdf
- Document Type :
- article
- Full Text :
- https://doi.org/10.2196/46936