1. A Data-Centric Approach to Investigate the Feasibility of Utilizing Animal Medical Data as a Solution for Human Medical Data Scarcity
- Author
Rabiah A. Al-Qudah and Ching Y. Suen
- Subjects
Automated blood smear analysis, computer aided diagnosis, deep learning, reticulocyte, data scarcity, data centric artificial intelligence, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
Reticulocyte count is a routine blood test that can be an essential source of knowledge for medical doctors to diagnose and assess patients’ health condition. Automating this blood test would reduce cost and time and protect laboratorians’ lives, especially during pandemics and outbreaks. However, the scarcity of human reticulocyte data is a main challenge that slows the pace of the test's automation. This paper investigates a novel method that assesses the feasibility of using animal reticulocyte cells to compensate for the scarcity of human reticulocyte data. Animal cells are integrated through a data-centric artificial intelligence approach, together with multiple deep classifiers that use transfer learning in different experimental setups, in a procedure that mimics the protocol followed in experimental medical labs. To evaluate the effectiveness of the proposed method, three evaluation criteria are proposed: the pretraining boost, the dataset similarity boost, and the dataset size boost measures. All experiments were conducted on a public human reticulocyte dataset, and the best-performing model achieved 98.9% average accuracy, 98.9% average macro precision, and 98.6% average macro F-score. Moreover, the results showed that animal medical data is a promising solution for human medical data scarcity: a model using weights pretrained on a medium-size feline reticulocyte dataset outperformed the model using weights pretrained on the large-scale ImageNet dataset.
- Published
- 2024