Utilizing machine learning to classify persistent organic pollutants in the serum of pregnant women: a predictive modeling approach.

Authors :
Mahfouz, Maya
Mahfouz, Yara
Harmouche-Karaki, Mireille
Matta, Joseph
Younes, Hassan
Helou, Khalil
Finan, Ramzi
Abi-Tayeh, Georges
Meslimani, Mohamad
Moussa, Ghada
Chahrour, Nada
Osseiran, Camille
Skaiki, Farouk
Narbonne, Jean-François
Source :
Environmental Science & Pollution Research; Aug 2024, Vol. 31 Issue 40, p52980-52995, 16p
Publication Year :
2024

Abstract

Polychlorinated biphenyls (PCBs), organochlorine pesticides (OCPs), polychlorinated dibenzo-p-dioxins and polychlorinated dibenzofurans (PCDD/Fs), and per- and poly-fluoroalkyl substances (PFAS) are persistent organic pollutants (POPs) that remain detrimental to critical subpopulations, namely pregnant women. Required tests for biomonitoring are quite expensive. Moreover, statistical models aiming to discover the relationships between pollutant levels and human characteristics have their limitations. Therefore, the objective of this study is to use machine learning predictive models to further examine the pollutants' predictors, while comparing them. Levels of 33 congeners were measured in the serum of 269 pregnant women, from whom data were collected regarding sociodemographic, dietary, environmental, and anthropometric characteristics. Several machine learning algorithms were compared using Python for each pollutant: support vector machine (SVM), random forest, XGBoost, and neural networks. The aforementioned characteristics were included in the model as features. Prediction, accuracy, precision, recall, F1-score, area under the ROC curve (AUC), sensitivity, and specificity were retrieved to compare the models with one another and across pollutants. The highest-performing model for all pollutants was random forest. Results showed a moderate to acceptable performance and discriminative power among all POPs, with the OCPs' model performing slightly better than all other models. Top related features for each model were also presented using SHAP analysis, detailing the predictors' negative or positive impact on the model. In conclusion, developing such a tool is of major importance in a context of limited financial and research resources. Nevertheless, machine learning models should always be interpreted with caution by exploring all evaluation metrics. [ABSTRACT FROM AUTHOR]
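
The evaluation metrics named in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the names `ConfusionCounts`, `binary_metrics`, and `rank_auc`, and all counts and scores, are hypothetical and chosen only for illustration. It assumes a binary classification (for example, a pollutant level above versus below a cutoff) and shows how accuracy, precision, recall (which is the same quantity as sensitivity), specificity, F1-score, and AUC are derived.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Hypothetical confusion-matrix counts for one binary classifier."""
    tp: int  # true positives
    fp: int  # false positives
    tn: int  # true negatives
    fn: int  # false negatives


def _ratio(num: float, den: float) -> float:
    # Return 0.0 instead of raising when a denominator is empty.
    return num / den if den else 0.0


def binary_metrics(c: ConfusionCounts) -> dict:
    """Derive the threshold-based metrics from confusion counts."""
    precision = _ratio(c.tp, c.tp + c.fp)
    recall = _ratio(c.tp, c.tp + c.fn)  # identical to sensitivity
    return {
        "accuracy": _ratio(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn),
        "precision": precision,
        "recall": recall,
        "sensitivity": recall,
        "specificity": _ratio(c.tn, c.tn + c.fp),
        "f1": _ratio(2 * precision * recall, precision + recall),
    }


def rank_auc(labels: list, scores: list) -> float:
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. Quadratic in sample size, which is
    acceptable for a small illustrative example.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return _ratio(wins, len(pos) * len(neg))


# Made-up example values, not data from the study.
example = binary_metrics(ConfusionCounts(tp=40, fp=10, tn=30, fn=20))
example_auc = rank_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

Reporting these metrics side by side matters because a single one can mislead: with imbalanced classes, accuracy can look acceptable while sensitivity or specificity is poor, which is consistent with the abstract's caution to explore all evaluation metrics.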

Details

Language :
English
ISSN :
0944-1344
Volume :
31
Issue :
40
Database :
Complementary Index
Journal :
Environmental Science & Pollution Research
Publication Type :
Academic Journal
Accession number :
179505906
Full Text :
https://doi.org/10.1007/s11356-024-34684-x