Automatic categorization of medical documents in Afaan Oromo using ensemble machine learning techniques

Authors :
Etana Fikadu Dinsa
Mrinal Das
Teklu Urgessa Abebe
Krishnaraj Ramaswamy
Source :
Discover Applied Sciences, Vol 6, Iss 11, Pp 1-17 (2024)
Publication Year :
2024
Publisher :
Springer, 2024.

Abstract

Automatic medical document classification using machine learning techniques can enhance the productivity of healthcare services by reducing processing time and cost. This work proposes an ensemble learning approach to develop a model that classifies electronic medical documents in Afaan Oromo. The main tasks are corpus preparation, pre-processing, model training, and classification. We used the term frequency-inverse document frequency (TF-IDF) and bag of words (BOW) feature extraction methods. The ensemble technique generates predictions from multiple individual classifiers (naïve Bayes, random forest, SVM, and logistic regression) and then combines them to produce a more reliable and accurate classifier. Performance was evaluated using accuracy, precision, recall, and F1-score. The proposed method is compared with two existing boosting approaches, namely gradient boosting and AdaBoost. The experimental results show that BOW feature extraction outperforms TF-IDF on our dataset. The results also demonstrate the effectiveness of the proposed model, which achieved 94.81% accuracy and a 94.84% F1-score. This work contributes to the technological enhancement of service delivery, to document management through classification, and to the advancement of data processing systems in the healthcare sector.
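The two feature extraction methods and the idea of combining classifier predictions can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not state the combination rule, so hard majority voting is an assumption here, as are the whitespace tokenizer, the unsmoothed idf = ln(N / df) weighting, the function names, the toy English documents, and the class labels.

```python
import math
from collections import Counter


def bag_of_words(docs):
    """Return (vocabulary, count vectors) for whitespace-tokenized documents."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vectors.append([counts[term] for term in vocab])
    return vocab, vectors


def tf_idf(vectors):
    """Reweight raw counts by idf = ln(N / df) (assumed, unsmoothed variant)."""
    n_docs = len(vectors)
    n_terms = len(vectors[0])
    # df[j] = number of documents containing term j (always >= 1 here,
    # because the vocabulary is built from these same documents)
    df = [sum(1 for v in vectors if v[j] > 0) for j in range(n_terms)]
    idf = [math.log(n_docs / d) for d in df]
    return [[v[j] * idf[j] for j in range(n_terms)] for v in vectors]


def majority_vote(predictions):
    """Combine the labels predicted by several classifiers for one document.

    Hard voting is an assumption; ties go to the label seen first.
    """
    return Counter(predictions).most_common(1)[0][0]


# Toy documents, invented for illustration only
docs = ["fever cough fever", "cough headache", "fracture cast"]
vocab, bow = bag_of_words(docs)
weighted = tf_idf(bow)

# Hypothetical outputs of four base classifiers for a single document
votes = ["class_a", "class_b", "class_a", "class_a"]
combined = majority_vote(votes)
```

With BOW, each document is represented by raw term counts; TF-IDF scales those counts down for terms that occur in many documents. In the sketch, `combined` is the label chosen by most of the four hypothetical base classifiers.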

Details

Language :
English
ISSN :
3004-9261
Volume :
6
Issue :
11
Database :
Directory of Open Access Journals
Journal :
Discover Applied Sciences
Publication Type :
Academic Journal
Accession number :
edsdoj.8c82d1ffb61a42778f005bcb675db911
Document Type :
article
Full Text :
https://doi.org/10.1007/s42452-024-06307-0