Spatial predictions of groundwater potential using automated machine learning (AutoML): a comparative study of feature selection and training sample size in Qinghai Province, China.

Authors :
Wang, Zitao
Wang, Jianping
Li, Mengling
Source :
Environmental Science & Pollution Research; Jan2024, Vol. 31 Issue 1, p1127-1145, 19p
Publication Year :
2024

Abstract

Predicting groundwater potential is crucial for identifying the spatial distribution of groundwater in a region. It serves as an essential guide for the development, utilization, and protection of groundwater resources. Previous studies have primarily emphasized finding the most accurate prediction model for groundwater potential while giving less attention to the selection of training features and sample sizes. This study aims to predict groundwater potential within Qinghai Province using automated machine learning technology and to assess the influence of sample size and feature selection on prediction accuracy. Sixteen groundwater conditioning factors were categorized into categorical and numerical variables. Four feature selection modes were utilized as input in training the model. The results indicated that, except for the correlations between evaporation and landforms (−0.8) and between precipitation and normalized difference vegetation index (0.8), the Pearson correlation coefficients among the sixteen factors lay between −0.5 and 0.5. The models XGB-ALL, RF-Entropy, ET-CRITIC, and XGB-PCA yielded accuracy scores of 0.783, 0.685, 0.745, and 0.703 and area under the curve (AUC) values of 0.819, 0.724, 0.779, and 0.747, respectively. If enough samples are available with the tree model, an increased number of features can improve prediction accuracy. The principal component analysis method showed difficulty in reducing the dimensionality of the input space, while the Entropy method proved efficient. The accuracy and AUC value of the prediction model improved with an increasing number of samples. Training with 8 features and 200 data points achieved an accuracy of 0.745, sufficient to evaluate regional groundwater potential. With 600 training samples, the model's accuracy rose to 0.9, enabling precise groundwater potential prediction.
The outputs of this research can provide decision-makers in groundwater resource management in Qinghai Province with crucial theoretical and practical support. The lessons learned can have future applications in similar situations. [ABSTRACT FROM AUTHOR]
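
The abstract mentions two screening steps that can be illustrated with a short Python sketch: a Pearson correlation check across conditioning factors, and entropy-based weighting of features. The code below is only a minimal illustration under stated assumptions, not the authors' implementation. It assumes the textbook entropy weight formulation (min-max scaling, column-wise proportions, Shannon entropy normalized by ln n), and the function names `correlated_pairs` and `entropy_weights` are hypothetical.

```python
import numpy as np


def correlated_pairs(X, names, threshold=0.8):
    """Return (name_i, name_j, r) for factor pairs with |Pearson r| >= threshold."""
    r = np.corrcoef(np.asarray(X, dtype=float), rowvar=False)
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(r[i, j]) >= threshold:
                pairs.append((names[i], names[j], float(r[i, j])))
    return pairs


def entropy_weights(X):
    """Entropy weight method (assumed textbook form): one weight per column.

    Columns whose values vary more unevenly across samples carry lower
    entropy and therefore receive a larger weight.
    """
    X = np.asarray(X, dtype=float)
    n, _ = X.shape

    # Min-max scale each column to [0, 1]; constant columns become all zeros.
    col_min = X.min(axis=0)
    span = X.max(axis=0) - col_min
    Z = (X - col_min) / np.where(span > 0, span, 1.0)

    # Column-wise proportions; a constant column is treated as uniform (1/n),
    # which yields maximum entropy and hence zero weight.
    col_sum = Z.sum(axis=0)
    P = np.where(col_sum > 0, Z / np.where(col_sum > 0, col_sum, 1.0), 1.0 / n)

    # Shannon entropy per column, with 0 * log(0) taken as 0.
    logP = np.zeros_like(P)
    mask = P > 0
    logP[mask] = np.log(P[mask])
    e = -(P * logP).sum(axis=0) / np.log(n)

    # Degree of diversification, clipped to absorb floating-point error.
    d = np.clip(1.0 - e, 0.0, None)
    total = d.sum()
    if total == 0:
        # No column carries information: fall back to equal weights.
        return np.full(d.shape, 1.0 / d.size)
    return d / total
```

In use, `correlated_pairs` would flag pairs such as the two strongly correlated ones reported in the abstract, and ranking the output of `entropy_weights` gives one possible way to pick a reduced feature subset before training.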

Details

Language :
English
ISSN :
09441344
Volume :
31
Issue :
1
Database :
Complementary Index
Journal :
Environmental Science & Pollution Research
Publication Type :
Academic Journal
Accession number :
174797860
Full Text :
https://doi.org/10.1007/s11356-023-31262-5