Predicting blood lead in Uruguayan children: Individual- vs neighborhood-level ensemble learners.

Authors :
Seth Frndak
Elena I Queirolo
Nelly Mañay
Guan Yu
Zia Ahmed
Gabriel Barg
Craig Colder
Katarzyna Kordas
Source :
PLOS Global Public Health, Vol 4, Iss 9, p e0003607 (2024)
Publication Year :
2024
Publisher :
Public Library of Science (PLoS), 2024.

Abstract

Predicting childhood blood lead levels (BLLs) has had mixed success, and it is unclear whether individual- or neighborhood-level variables are more predictive. An ensemble machine learning (ML) approach was implemented to identify the most relevant predictors of BLL ≥2 μg/dL in urban children. A cross-sectional sample of 603 children (~7 years of age), recruited between 2009 and 2019 from Montevideo, Uruguay, participated in the study. A total of 77 individual-level and 32 neighborhood-level variables were used to predict BLLs ≥2 μg/dL. Three ensemble learners were created: one with individual-level predictors (Ensemble-I), one with neighborhood-level predictors (Ensemble-N), and one with both (Ensemble-All). Each ensemble learner comprised four base classifiers, with the data split into 50% training, 25% validation, and 25% test datasets. Predictive performance of the three ensemble models was compared on the test dataset using area under the curve (AUC) for the receiver operating characteristic (ROC), precision, sensitivity, and specificity. Ensemble-I (AUC: 0.75, precision: 0.56, sensitivity: 0.79, specificity: 0.65) performed similarly to Ensemble-All (AUC: 0.75, precision: 0.63, sensitivity: 0.79, specificity: 0.69). Ensemble-N (AUC: 0.51, precision: 0.0, sensitivity: 0.0, specificity: 0.50) severely underperformed. Year of enrollment was the most important predictor in Ensemble-I and Ensemble-All, followed by household water Pb. Three neighborhood-level variables were among the top 10 most important predictors in Ensemble-All (density of bus routes, dwellings with stream/other water source, and distance to nearest river). The individual-level only model performed best, although precision improved when both neighborhood- and individual-level variables were included. Future predictive models of lead exposure should consider proximal predictors (i.e., household characteristics).
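The four metrics named in the abstract (AUC, precision, sensitivity, specificity) can be illustrated with a minimal Python sketch. It is not the authors' code: the function names `classification_metrics` and `roc_auc` and the toy labels and scores are hypothetical, and the choice to report a ratio as 0.0 when its denominator is zero is an assumed convention, not something the record states.

```python
def classification_metrics(y_true, y_pred):
    """Precision, sensitivity, and specificity from binary labels (1 = BLL at or above cutoff)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    def ratio(num, den):
        # Assumed convention: report 0.0 when the ratio is undefined.
        return num / den if den else 0.0

    return {
        "precision": ratio(tp, tp + fp),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
    }


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    # Count pairwise wins; ties count as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.3, 0.7, 0.4, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # 0.5 threshold is an assumption

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, round(auc, 3))
```

Under the assumed zero-denominator convention, a classifier that never predicts the positive class gets precision 0.0 and sensitivity 0.0, while an AUC near 0.5 corresponds to scores that rank positives no better than chance.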

Details

Language :
English
ISSN :
27673375
Volume :
4
Issue :
9
Database :
Directory of Open Access Journals
Journal :
PLOS Global Public Health
Publication Type :
Academic Journal
Accession number :
edsdoj.378a6154ca984f7f8b07dd5008091350
Document Type :
article
Full Text :
https://doi.org/10.1371/journal.pgph.0003607