Yen-Ling Chiu,1–3 Mao-Jhen Jhou,4 Tian-Shyug Lee,4,5 Chi-Jie Lu,4–6 Ming-Shu Chen7
1Graduate Institute of Medicine and Graduate School of Biomedical Informatics, Yuan Ze University, Taoyuan, 32003, Taiwan, Republic of China
2Graduate Institute of Clinical Medicine, National Taiwan University College of Medicine, Taipei, 10002, Taiwan, Republic of China
3Department of Medical Research, Department of Medicine, Far Eastern Memorial Hospital, New Taipei, 22056, Taiwan, Republic of China
4Graduate Institute of Business Administration, Fu Jen Catholic University, New Taipei, 242062, Taiwan, Republic of China
5Artificial Intelligence Development Center, Fu Jen Catholic University, New Taipei City, 242062, Taiwan, Republic of China
6Department of Information Management, Fu Jen Catholic University, New Taipei City, 242062, Taiwan, Republic of China
7Department of Healthcare Administration, College of Healthcare and Management, Asia Eastern University of Science and Technology, New Taipei, 22061, Taiwan, Republic of China
Correspondence: Chi-Jie Lu, Fu Jen Catholic University, New Taipei 242062, Taiwan, Email 059099@mail.fju.edu.tw
Ming-Shu Chen, Asia Eastern University of Science and Technology, No.58, Sec. 2, Sichuan Rd, Pan-Chiao Dist., New Taipei, 22061, Taiwan, Email tree1013@gmail.com
Purpose: As global aging progresses, the health management of chronic diseases has become an important issue of concern to governments.
Driven by population aging and by improvements in the medical system and healthcare in general, the number of patients with chronic kidney disease (CKD) in Taiwan has grown year by year, including high-risk cases that pose major health hazards to the middle-aged and elderly populations.
Methods: This study analyzed annual health screening data for 65,394 people from 2010 to 2015, sourced from the MJ Group, a major health screening center in Taiwan, and covering 18 risk indicators. We used five prediction methods, namely logistic regression (LR), C5.0 decision tree (C5.0), stochastic gradient boosting (SGB), multivariate adaptive regression splines (MARS), and eXtreme gradient boosting (XGBoost), with estimated glomerular filtration rate (e-GFR) data to determine the risk factors for stage G3a, G3b, and G4 CKD.
Results: LR (AUC=0.848), SGB (AUC=0.855), and XGBoost (AUC=0.858) achieved similar classification performance, and all three outperformed C5.0 and MARS. Among the CKD risk factors, blood urea nitrogen (BUN) and uric acid (UA) were identified as the first and second most important indicators in the models of all five methods, and both are also clinically recognized as major risk factors. The results for systolic blood pressure (SBP), SGPT, SGOT, and LDL were similar to those of a related study. Interestingly, however, education, an indicator related to socioeconomic status, was the third most important indicator in all three of the better-performing methods. This indicates that it is more important than the remaining risk indicators of this study, whose importance rankings varied from method to method.
Conclusion: The five prediction methods provided high and similar classification performance in this study.
Based on these results, we recommend that education, as an indicator of socioeconomic status, be treated as an important factor for CKD, as a high educational level showed a negative and highly significant correlation with CKD. The findings of this study should also be of value for further discussion and follow-up research.
Keywords: chronic kidney disease, health screening, machine learning algorithms, risk indicators assessment, education
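The outcome definition and the AUC comparison described in the Methods and Results can be illustrated with a minimal Python sketch. It assumes the standard KDIGO GFR category cut-offs (in mL/min/1.73 m^2) and treats stages G3a, G3b, and G4 as the positive class, which is our reading of the abstract rather than the authors' published code. The function names `ckd_stage`, `is_target_case`, and `auc` are hypothetical, and the AUC is computed by plain pairwise comparison, not by the software used in the study.

```python
from typing import Sequence

# Assumption: the positive class is stage G3a, G3b, or G4, as named in the abstract.
TARGET_STAGES = {"G3a", "G3b", "G4"}


def ckd_stage(egfr: float) -> str:
    """Map an eGFR value (mL/min/1.73 m^2) to a KDIGO GFR category."""
    if egfr >= 90:
        return "G1"
    if egfr >= 60:
        return "G2"
    if egfr >= 45:
        return "G3a"
    if egfr >= 30:
        return "G3b"
    if egfr >= 15:
        return "G4"
    return "G5"


def is_target_case(egfr: float) -> bool:
    """Return True when the eGFR falls in one of the target stages."""
    return ckd_stage(egfr) in TARGET_STAGES


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y]
    neg = [s for y, s in zip(labels, scores) if not y]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy eGFR values and model scores, invented purely for illustration.
    egfr_values = [95.0, 52.0, 38.0, 71.0]
    labels = [int(is_target_case(v)) for v in egfr_values]
    scores = [0.10, 0.80, 0.70, 0.30]
    print(labels, auc(labels, scores))
```

The pairwise form makes the meaning of AUC explicit: it is the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. It scales quadratically with sample size, so a rank-based implementation would be preferable for a dataset of the size reported here.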