Automatic categorization of medical documents in Afaan Oromo using ensemble machine learning techniques
- Source :
- Discover Applied Sciences, Vol 6, Iss 11, Pp 1-17 (2024)
- Publication Year :
- 2024
- Publisher :
- Springer, 2024.
Abstract
- Automatic medical document classification using machine learning techniques can enhance the productivity of healthcare services by reducing processing time and cost. This work proposes an ensemble learning approach to develop a model that classifies electronic medical documents in Afaan Oromo. The main tasks are corpus preparation, pre-processing, model training, and classification. We used the term frequency-inverse document frequency (TF-IDF) and bag-of-words (BOW) feature extraction methods. The ensemble technique generates individual predictions from naïve Bayes, random forest, support vector machine (SVM), and logistic regression classifiers and then combines them to produce a more reliable and accurate classifier. Performance was evaluated using accuracy, F1-score, recall, and precision. The proposed method is compared with two existing boosting approaches, namely gradient boosting and AdaBoost. The experimental results show that BOW feature extraction is more effective than TF-IDF on our dataset. The results also demonstrate the effectiveness of the proposed model, which achieved 94.81% accuracy and a 94.84% F1-score. This work significantly contributes to the technological enhancement of service delivery, document management through classification methods, and the advancement of data processing systems in the healthcare sector.
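The pipeline described in the abstract (feature extraction, several base classifiers, a combined prediction) can be illustrated with a minimal standard-library sketch. The abstract does not state how the individual predictions are combined, so hard majority voting is an assumption here, not the authors' confirmed method. The function names (`bow_features`, `tfidf_features`, `majority_vote`, `accuracy`) are hypothetical, the tokenizer is a plain whitespace split, and training of the four base classifiers is left out: the sketch only shows the two feature representations and the combination step.

```python
import math
from collections import Counter


def bow_features(docs):
    """Bag of words: raw term counts per document (whitespace tokens)."""
    return [Counter(doc.lower().split()) for doc in docs]


def tfidf_features(docs):
    """TF-IDF: term count scaled by log(N / document frequency)."""
    counts = bow_features(docs)
    n_docs = len(docs)
    # Iterating a Counter yields each distinct term once per document.
    doc_freq = Counter(term for c in counts for term in c)
    return [
        {term: tf * math.log(n_docs / doc_freq[term]) for term, tf in c.items()}
        for c in counts
    ]


def majority_vote(predictions):
    """Combine per-classifier label lists by hard (majority) voting.

    `predictions` holds one list of labels per base classifier, all of
    the same length. Ties go to the label voted first in classifier order.
    Assumption: the source does not specify the combination rule.
    """
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]


def accuracy(y_true, y_pred):
    """Fraction of documents whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```

In use, each base classifier (for example naïve Bayes, random forest, SVM, and logistic regression) would be trained on either the BOW or the TF-IDF vectors, and their label lists passed to `majority_vote`; the combined labels can then be scored with `accuracy` against held-out ground truth.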
Details
- Language :
- English
- ISSN :
- 3004-9261
- Volume :
- 6
- Issue :
- 11
- Database :
- Directory of Open Access Journals
- Journal :
- Discover Applied Sciences
- Publication Type :
- Academic Journal
- Accession number :
- edsdoj.8c82d1ffb61a42778f005bcb675db911
- Document Type :
- article
- Full Text :
- https://doi.org/10.1007/s42452-024-06307-0