
Machine learning model (RG-DMML) and ensemble algorithm for prediction of students’ retention and graduation in education

Authors :
Kingsley Okoye
Julius T. Nganji
Jose Escamilla
Samira Hosseini
Source :
Computers and Education: Artificial Intelligence, Vol 6, Article 100205 (2024)
Publication Year :
2024
Publisher :
Elsevier, 2024.

Abstract

Automated prediction of students' retention and graduation in education using advanced analytical methods such as artificial intelligence (AI) has recently attracted the attention of educators, both in theory and in practice. Although invaluable insights and theories for measuring and testing the topic have been proposed, most existing methods do not technically highlight the non-trivial factors behind the well-known challenges and attrition. To this end, using two categories of data collected in a higher education setting about students' (i) retention (n = 52262) and (ii) graduation (n = 53639), this study proposes a machine learning model, RG-DMML (retention and graduation data mining and machine learning), and an ensemble algorithm for predicting students' retention and graduation status in education. This was done by training and testing key features that are technically deemed suitable for measuring the constructs (retention and graduation), such as (i) the average grade of the previous high school and (ii) the entry/admission score. The proposed model (RG-DMML) is designed based on the cross-industry standard process for data mining (CRISP-DM) methodology, implemented using supervised machine learning techniques such as K-Nearest Neighbor (KNN), and validated using the k-fold cross-validation method. The results show that the executed model and algorithm, based on the Bagging method and 10-fold cross-validation, are efficient and effective for predicting students' retention and graduation status, with Precision (retention = 0.909, graduation = 0.822), Recall (retention = 1.000, graduation = 0.957), Accuracy (retention = 0.909, graduation = 0.817), and F1-Score (retention = 0.952, graduation = 0.885) indicating high performance levels, and a low Error-rate (retention = 0.090, graduation = 0.182).
In addition, by considering the individual features selected through the Wrapper method in predicting the outputs, the proposed model proved more effective for predicting the students' retention status in comparison to the graduation data. The implications of the models' output and factors that impact the effective prediction or identification of at-risk students, e.g., for timely intervention, counselling, decision-making, and sustainable educational practice are empirically discussed in the study.
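
The evaluation setup named in the abstract (KNN base learners combined by Bagging, scored with 10-fold cross-validation on precision, recall, accuracy, F1-score, and error rate) can be illustrated with a minimal sketch. This is not the authors' RG-DMML implementation: the toy data, the two numeric features standing in for high school average grade and entry score, the function names, and the settings (k = 5, 11 bootstrap models) are all assumptions chosen for illustration, and the Wrapper feature selection step is omitted.

```python
import random
from collections import Counter


def make_toy_data(n=200, seed=1):
    """Hypothetical data: (high-school average, entry score) -> 1 retained / 0 not."""
    rng = random.Random(seed)
    rows = []
    for i in range(n):
        label = i % 2
        centre = 80.0 if label == 1 else 65.0  # assumed class means, not from the paper
        rows.append(((rng.gauss(centre, 8.0), rng.gauss(centre, 8.0)), label))
    return rows


def knn_predict(train, query, k):
    """Return the majority label among the k training rows closest to query."""
    by_distance = sorted(
        train,
        key=lambda row: sum((a - b) ** 2 for a, b in zip(row[0], query)),
    )
    return Counter(label for _, label in by_distance[:k]).most_common(1)[0][0]


def bagged_predict(bags, query, k):
    """Bagging: every bootstrap sample casts one KNN vote; the majority wins."""
    votes = Counter(knn_predict(bag, query, k) for bag in bags)
    return votes.most_common(1)[0][0]


def classification_metrics(y_true, y_pred):
    """Compute the five metrics reported in the abstract from binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    accuracy = (tp + tn) / len(pairs)
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "accuracy": accuracy,
        "f1": f1,
        "error_rate": 1.0 - accuracy,
    }


def cross_validate_bagged_knn(data, n_folds=10, n_models=11, k=5, seed=0):
    """k-fold CV: each fold is held out once; metrics use pooled held-out predictions."""
    rng = random.Random(seed)
    order = list(range(len(data)))
    rng.shuffle(order)
    folds = [order[i::n_folds] for i in range(n_folds)]
    y_true, y_pred = [], []
    for held_out in folds:
        held = set(held_out)
        train = [data[i] for i in order if i not in held]
        # Bootstrap: sample the training fold with replacement, once per model.
        bags = [[rng.choice(train) for _ in train] for _ in range(n_models)]
        for i in held_out:
            features, label = data[i]
            y_true.append(label)
            y_pred.append(bagged_predict(bags, features, k))
    return classification_metrics(y_true, y_pred)


scores = cross_validate_bagged_knn(make_toy_data())
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

An odd number of bootstrap models and an odd k are used here so that binary votes cannot tie. The error rate is defined as one minus accuracy, which matches the relationship between the accuracy and error-rate figures reported above up to rounding.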

Details

Language :
English
ISSN :
2666920X
Volume :
6
Article number :
100205
Database :
Directory of Open Access Journals
Journal :
Computers and Education: Artificial Intelligence
Publication Type :
Academic Journal
Accession number :
edsdoj.4220eb66107a40dca64a3babad31111f
Document Type :
article
Full Text :
https://doi.org/10.1016/j.caeai.2024.100205