Diagnosis of Diseases from Medical Check-up Test Reports Using OCR Technology with BoW and AdaBoost algorithms

Authors :
Musa M. Ameen
Wisam A. Qader
Source :
2019 International Engineering Conference (IEC).
Publication Year :
2019
Publisher :
IEEE, 2019.

Abstract

This research introduces an approach for diagnosing diseases from medical check-up test reports. The approach combines Optical Character Recognition (OCR) technology, which converts hard-copy test reports into editable text; the Bag of Words (BoW) model as the feature selection algorithm; Naive Bayes as the classification algorithm; and the AdaBoost technique to enhance the performance of the Naive Bayes classifier. The approach is trained on dedicated training partitions of multiple medical datasets and then tested on the test partitions of the same datasets. It is compared with the Support Vector Machine (SVM), Naive Bayes (NB), Decision Table (DT), and k-Nearest Neighbors (k-NN) classifiers, with all algorithms tested on the same datasets, and it showed higher accuracy than the other four classifiers. The combination of BoW with AdaBoost is therefore used to predict the name of the disease from a medical check-up test report. An example image of the disease is then presented, together with the disease name, to the physician and the patient. The image is important for patients because they may not be familiar with medical terms and disease names. Given its good performance and validated test results, the proposed approach can be used in the medical field.
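The classification stage described in the abstract (BoW features, a Naive Bayes base learner, AdaBoost on top) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the OCR step is assumed to have already produced plain text, the boosting variant (SAMME-style multi-class AdaBoost) is an assumption, all function and class names are hypothetical, and the four training reports are invented toy strings, not data from the paper. At least two disease classes are assumed.

```python
import math
from collections import Counter

def bag_of_words(text):
    """Turn OCR output (assumed to be plain text) into word counts."""
    return Counter(text.lower().split())

class WeightedNaiveBayes:
    """Multinomial Naive Bayes that accepts per-sample weights (hypothetical helper)."""

    def fit(self, docs, labels, weights):
        scale = len(docs) / sum(weights)  # rescale so weights sum to the sample count
        self.classes = sorted(set(labels))
        self.vocab = {word for doc in docs for word in doc}
        self.class_weight = {c: 0.0 for c in self.classes}
        self.word_weight = {c: Counter() for c in self.classes}
        for doc, label, w in zip(docs, labels, weights):
            self.class_weight[label] += w * scale
            for word, count in doc.items():
                self.word_weight[label][word] += w * scale * count
        return self

    def predict(self, doc):
        total = sum(self.class_weight.values())
        scores = {}
        for c in self.classes:
            # Laplace (add-one) smoothing on both the prior and the word likelihoods.
            score = math.log((self.class_weight[c] + 1.0) / (total + len(self.classes)))
            denom = sum(self.word_weight[c].values()) + len(self.vocab)
            for word, count in doc.items():
                if word in self.vocab:  # words unseen in training are ignored
                    score += count * math.log((self.word_weight[c][word] + 1.0) / denom)
            scores[c] = score
        return max(scores, key=scores.get)

def adaboost_fit(docs, labels, rounds=5):
    """SAMME-style AdaBoost over weighted Naive Bayes learners (assumed variant)."""
    n, k = len(docs), len(set(labels))
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        model = WeightedNaiveBayes().fit(docs, labels, weights)
        missed = [model.predict(d) != y for d, y in zip(docs, labels)]
        err = sum(w for w, m in zip(weights, missed) if m)
        if err >= 1.0 - 1.0 / k:  # no better than chance: stop boosting
            break
        alpha = math.log((1.0 - err) / max(err, 1e-10)) + math.log(k - 1)
        ensemble.append((alpha, model))
        if err == 0:  # training set already fitted perfectly
            break
        # Increase the weight of misclassified reports, then renormalize.
        weights = [w * math.exp(alpha) if m else w for w, m in zip(weights, missed)]
        norm = sum(weights)
        weights = [w / norm for w in weights]
    return ensemble

def adaboost_predict(ensemble, doc):
    """Weighted vote of the boosted classifiers."""
    votes = Counter()
    for alpha, model in ensemble:
        votes[model.predict(doc)] += alpha
    return votes.most_common(1)[0][0]

# Invented toy reports, for illustration only.
train = [
    ("glucose high hba1c high", "diabetes"),
    ("fasting glucose elevated insulin low", "diabetes"),
    ("hemoglobin low ferritin low", "anemia"),
    ("rbc low hemoglobin low iron low", "anemia"),
]
docs = [bag_of_words(text) for text, _ in train]
labels = [label for _, label in train]
ensemble = adaboost_fit(docs, labels)
print(adaboost_predict(ensemble, bag_of_words("glucose high insulin low")))
```

On a real dataset the boosting loop would normally run for several rounds; on this tiny toy set the first Naive Bayes learner already fits the training reports, so the ensemble contains a single model.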

Details

Database :
OpenAIRE
Journal :
2019 International Engineering Conference (IEC)
Accession number :
edsair.doi...........10ba3f79a5a33e6fb9225f68a8e64fbb