Enhanced streamflow prediction using SWAT's influential parameters: a comparative analysis of PCA-MLR and XGBoost models.

Authors :
R, Yamini Priya
R, Manjula
Source :
Earth Science Informatics. Dec2023, Vol. 16 Issue 4, p4053-4076. 24p.
Publication Year :
2023

Abstract

Accurate streamflow estimation and assessing the significant parameters are crucial for effective water resource management. In this research, the SWAT model was used to determine streamflow in the Ponnaiyar River Basin, achieving satisfactory accuracy with NSE and R² of 0.67, KGE of 0.73, and RMSE of 9.257 during calibration. The correlated parameters were established using Pearson Correlation Analysis from the calibrated SWAT-generated parameters. Streamflow prediction was performed with Principal Component Analysis-Multiple Linear Regression (PCA-MLR) using these correlated parameters, resulting in an accuracy of NSE and R² = 0.67, KGE = 0.69, and RMSE = 9.577 during training, and NSE and R² = 0.47, KGE = 0.49, and RMSE = 13.624 during testing. Since PCA-MLR exhibited reduced accuracy during testing, this study proposed the combined Soil and Water Assessment Tool-eXtreme Gradient Boosting (SWAT-XGBoost) model, which outperformed the leading-edge models such as SWAT-Categorical Boosting (SWAT-CatBoost) and SWAT-Light Gradient Boosting Machine (SWAT-LightGBM) while maintaining the same correlated parameters. The SWAT-XGBoost model achieved enhanced accuracy with NSE and R² = 0.83, KGE = 0.85, and RMSE = 2.226 during training, and NSE and R² = 0.67, KGE = 0.69, and RMSE = 9.805 during testing. The most influential parameters were determined for accurate streamflow prediction using XGBoost's built-in feature importance. The XGBoost model was developed, considering only these influential parameters among the correlated ones, maintaining the same accuracy during training but exhibiting increased accuracy of NSE and R² = 0.71, KGE = 0.72, and RMSE = 8.516 during testing. Additionally, SHapley Additive exPlanations (SHAP) impact analysis was conducted on the SWAT-XGBoost model to explain the interactions between these influential parameters.
Based on the results of the SHAP impact analysis, an XGBoost model was constructed, incorporating positive impact features, negative impact features, and a combination of both. The XGBoost model, built with combined positive and negative impact features, exhibited superior accuracy during training and testing compared to SWAT-XGBoost, which focused primarily on the most influential parameters. This study provides valuable guidance for researchers and policymakers working with limited data availability using integrated model development techniques to enhance streamflow prediction. [ABSTRACT FROM AUTHOR]
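
The abstract reports NSE, KGE, and RMSE without defining them. As a reading aid, here is a minimal Python sketch of the usual definitions of these three scores. It is not the authors' code. The KGE form used (the original 2009 formulation with correlation, variability ratio, and bias ratio) is an assumption, since the abstract does not say which variant was applied, and the `obs` and `sim` series are hypothetical values chosen only for illustration.

```python
import math
from statistics import mean, pstdev


def rmse(obs, sim):
    """Root mean square error between observed and simulated flows."""
    return math.sqrt(mean([(s - o) ** 2 for o, s in zip(obs, sim)]))


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    obs_mean = mean(obs)
    residual = sum((o - s) ** 2 for o, s in zip(obs, sim))
    variance = sum((o - obs_mean) ** 2 for o in obs)
    return 1.0 - residual / variance


def pearson_r(x, y):
    """Pearson linear correlation coefficient."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def kge(obs, sim):
    """Kling-Gupta efficiency (assumed 2009 form): 1 is a perfect fit."""
    r = pearson_r(obs, sim)            # correlation term
    alpha = pstdev(sim) / pstdev(obs)  # variability ratio
    beta = mean(sim) / mean(obs)       # bias ratio
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


# Hypothetical observed and simulated streamflow series (illustration only).
obs = [2.0, 4.0, 6.0, 8.0]
sim = [3.0, 5.0, 7.0, 9.0]
scores = {"NSE": nse(obs, sim), "KGE": kge(obs, sim), "RMSE": rmse(obs, sim)}
```

In this toy case the simulation is the observation shifted by a constant, so correlation and variability are perfect and only the bias term lowers KGE. This shows how the two efficiency scores can respond differently to the same error.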

Details

Language :
English
ISSN :
18650473
Volume :
16
Issue :
4
Database :
Academic Search Index
Journal :
Earth Science Informatics
Publication Type :
Academic Journal
Accession number :
174096753
Full Text :
https://doi.org/10.1007/s12145-023-01139-9