
Applying XGBoost, neural networks, and oversampling in the undernutrition classification of school-aged children in the Philippines.

Authors :
Yiu, Mark Kevin A. Ong
Pastor, Carlo Gabriel M.
Candano, Gabrielle Jackie C.
Miro, Eden Delight P.
Antonio, Victor Andrew A.
Go, Clark Kendrick C.
Source :
AIP Conference Proceedings; 2024, Vol. 3128 Issue 1, p1-10, 10p
Publication Year :
2024

Abstract

In the Philippines, one in five school-aged children is affected by undernutrition, increasing their risk of impaired physical and cognitive development. The Department of Education (DepEd) attempts to address this issue by targeting children with low body mass index (BMI) for its school-based feeding program (SBFP). However, challenges such as inadequate measuring tools and supervision in low-resource communities have led to large discrepancies in nutritional status between SBFP beneficiaries and non-beneficiaries. Siy Van et al. [1] address the difficulties associated with BMI by using machine learning (ML) to predict undernutrition among school-aged children from socioeconomic and demographic characteristics, dietary diversity scores, and food insecurity scores. Their study compared several ML algorithms and found that their best-performing model in terms of accuracy was a random forest (RF) model. However, the RF model had high sensitivity with low specificity, indicating a bias towards the positive class. This study aims to improve on these results by employing oversampling techniques and ML algorithms that were not used in that study. Using the same data set as in [1], this study compares four machine learning algorithms (RF, XGBoost, DNN, and NNRF) for predicting undernutrition among school-aged children, managing imbalanced data with three oversampling techniques (SMOTE, Borderline-SMOTE, and ADASYN). Eight independent classification tasks for predicting undernutrition were performed, and results showed that an RF model with Borderline-SMOTE performed best in terms of Cohen's κ (0.3662), with an accuracy of 71.61%, a sensitivity of 71.13%, and a specificity of 73.08%. While RF performed best overall, XGBoost and NNRF outperformed RF on specific tasks. Notably, incorporating oversampling consistently enhanced model performance. [ABSTRACT FROM AUTHOR]
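The workflow the abstract describes (oversample the minority class, fit a random forest, score with Cohen's κ) can be sketched as follows. This is an illustrative sketch only, not the authors' code: the SMOTE-style oversampler is a minimal hand-rolled version of the interpolation idea, and the synthetic data stand in for the survey features, which are not available here.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import cohen_kappa_score
from sklearn.model_selection import train_test_split

def smote_oversample(X, y, minority_label=1, k=5, rng=None):
    """Basic SMOTE idea: synthesize minority samples by interpolating
    between a minority point and one of its k nearest minority neighbours,
    until the classes are balanced."""
    rng = rng if rng is not None else np.random.default_rng(0)
    X_min = X[y == minority_label]
    n_needed = int((y != minority_label).sum() - len(X_min))
    synth = []
    for _ in range(n_needed):
        i = rng.integers(len(X_min))
        d = np.linalg.norm(X_min - X_min[i], axis=1)   # distances to peers
        nn = np.argsort(d)[1:k + 1]                    # k nearest, skip self
        j = rng.choice(nn)
        gap = rng.random()                             # position on segment
        synth.append(X_min[i] + gap * (X_min[j] - X_min[i]))
    X_bal = np.vstack([X] + [np.asarray(synth)])
    y_bal = np.concatenate([y, np.full(n_needed, minority_label)])
    return X_bal, y_bal

# Imbalanced toy data (hypothetical stand-in for the survey features).
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(0.0, 1.0, (400, 6)), rng.normal(1.5, 1.0, (80, 6))])
y = np.concatenate([np.zeros(400, int), np.ones(80, int)])
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

X_bal, y_bal = smote_oversample(X_tr, y_tr)
clf = RandomForestClassifier(random_state=0).fit(X_bal, y_bal)
kappa = cohen_kappa_score(y_te, clf.predict(X_te))
print(f"Cohen's kappa: {kappa:.3f}")
```

In practice one would use a maintained implementation such as `imblearn.over_sampling.SMOTE` (or `BorderlineSMOTE`/`ADASYN` for the other two techniques the study compares) and oversample only the training fold, as above, so the test set remains untouched.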

Details

Language :
English
ISSN :
0094-243X
Volume :
3128
Issue :
1
Database :
Complementary Index
Journal :
AIP Conference Proceedings
Publication Type :
Conference
Accession number :
178423282
Full Text :
https://doi.org/10.1063/5.0213404