Classification and Explanation of Iron Deficiency Anemia from Complete Blood Count Data Using Machine Learning

Authors :
Siddartha Pullakhandam
Susan McRoy
Source :
BioMedInformatics, Vol 4, Iss 1, Pp 661-672 (2024)
Publication Year :
2024
Publisher :
MDPI AG, 2024.

Abstract

Background: Currently, discriminating Iron Deficiency Anemia (IDA) from other forms of anemia requires an expensive test (serum ferritin). Complete Blood Count (CBC) tests are less costly and more widely available. Machine learning models have not yet been applied to discriminating IDA but perform well on similar tasks. Methods: We built multiple machine learning models to classify IDA from CBC data using a US NHANES dataset of over 19,000 instances, calculating accuracy, precision, recall, and precision-recall AUC (PR AUC). We validated the results by applying the same model to an unseen dataset from Kenya. We calculated ranked feature importance to explain the global behavior of the model. Results: Our model classifies IDA with a PR AUC of 0.87 and a recall (sensitivity) of 0.98 on the original dataset and 0.89 on the unseen Kenya dataset. The explanations indicate that low blood hemoglobin level, higher age, and higher red blood cell distribution width were the most critical features. We also found that optimization made only minor changes to the explanations and that the features used remained consistent with professional practice. Conclusions: The overall high performance and consistency of the results suggest that the approach would be acceptable to health professionals and would support enhancements to current automated CBC analyzers.
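As an illustration of the evaluation metrics named in the abstract, the following minimal Python sketch computes precision, recall, and a PR AUC estimate (average precision) for a binary classifier. It is not the authors' code: the function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for this example, and the average-precision estimate shown here does not group tied scores.

```python
def precision_recall(y_true, y_pred):
    """Precision and recall for binary labels (1 = IDA, 0 = not IDA)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def average_precision(y_true, scores):
    """Estimate PR AUC as the mean precision at each rank holding a positive."""
    # Rank cases from highest to lowest predicted score.
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    n_pos = sum(y_true)
    if n_pos == 0:
        return 0.0
    tp = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            tp += 1
            total += tp / rank  # precision at this recall step
    return total / n_pos


# Hypothetical toy data: true labels and model scores for six patients.
y_true = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]

# Assumed decision threshold of 0.5 for turning scores into class labels.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

precision, recall = precision_recall(y_true, y_pred)
pr_auc = average_precision(y_true, scores)
print(f"precision={precision:.2f} recall={recall:.2f} PR AUC={pr_auc:.3f}")
```

Recall (sensitivity) is the fraction of true IDA cases the model flags, which is why it is the metric reported for both datasets; PR AUC summarizes the precision/recall trade-off across all thresholds and is often preferred when the positive class is rare.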

Details

Language :
English
ISSN :
26737426
Volume :
4
Issue :
1
Database :
Directory of Open Access Journals
Journal :
BioMedInformatics
Publication Type :
Academic Journal
Accession number :
edsdoj.0d403d8244e8f80305468d809452b
Document Type :
article
Full Text :
https://doi.org/10.3390/biomedinformatics4010036