Multiple remotely sensed datasets and machine learning models to predict chlorophyll-a concentration in the Nakdong River, South Korea.

Authors :
Lee B
Im JK
Han JW
Kang T
Kim W
Kim M
Lee S
Source :
Environmental science and pollution research international [Environ Sci Pollut Res Int] 2024 Oct; Vol. 31 (48), pp. 58505-58526. Date of Electronic Publication: 2024 Sep 24.
Publication Year :
2024

Abstract

The Nakdong River is a crucial water resource in South Korea, supplying water for purposes such as drinking, irrigation, and recreation. However, the river is vulnerable to algal blooms due to the inflow of pollutants from multiple point and non-point sources. Monitoring chlorophyll-a (Chl-a) concentration, a proxy for algal biomass, is essential for assessing the trophic status of the river and managing its ecological health. This study aimed to improve the accuracy and reliability of Chl-a estimation in the Nakdong River using machine learning models (MLMs) and the simultaneous use of multiple remotely sensed datasets. The study compared the performance of four MLMs, multi-layer perceptron (MLP), support vector machine (SVM), random forest (RF), and eXtreme Gradient Boosting (XGB), using three different input datasets: (1) two remotely sensed datasets combined (Sentinel-2 and Landsat-8), (2) standalone Sentinel-2, and (3) standalone Landsat-8. The results showed that the MLP model with multiple remotely sensed datasets outperformed the other MLMs, with R² values 0.43 to 0.86 higher and RMSE values 0.36 to 5.88 lower. The MLP model demonstrated the highest performance across the range of Chl-a concentrations and predicted peaks above 20 mg/m³ relatively well compared to the other models. This was likely due to the capacity of MLP to handle imbalanced datasets. The predictive map of the spatial distribution of Chl-a generated by MLP captured the areas with high and low Chl-a concentrations well. This study pointed out the impact of imbalanced Chl-a concentration observations (dominated by low Chl-a concentrations) on the performance of MLMs. The data imbalance likely left the MLMs poorly trained for high Chl-a values, producing low prediction accuracy in that range. In conclusion, this study demonstrated the value of multiple remotely sensed datasets in enhancing the accuracy and reliability of Chl-a estimation, particularly when using the MLP model. These findings provide valuable insights into utilizing MLMs effectively for Chl-a monitoring.
(© 2024. The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature.)
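
The abstract evaluates models by R² and RMSE and notes that observations dominated by low Chl-a values can hide poor accuracy on peaks above 20 mg/m³. The following is a minimal sketch of how those two metrics can be computed overall and separately for the low and high ranges. It is not the authors' code: the function names, the use of 20 mg/m³ as a split threshold, and all observed and predicted values are illustrative assumptions made up for this example.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between paired observations and predictions."""
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def score_by_range(observed, predicted, threshold=20.0):
    """Report overall metrics plus RMSE below and above a Chl-a threshold.

    The threshold (mg/m^3) is an assumption chosen to match the peak level
    mentioned in the abstract; it is not a parameter taken from the paper.
    """
    pairs = list(zip(observed, predicted))
    low = [(o, p) for o, p in pairs if o <= threshold]
    high = [(o, p) for o, p in pairs if o > threshold]
    return {
        "overall_r2": r_squared(observed, predicted),
        "overall_rmse": rmse(observed, predicted),
        "low_rmse": rmse(*zip(*low)) if low else None,
        "high_rmse": rmse(*zip(*high)) if high else None,
        "n_low": len(low),
        "n_high": len(high),
    }


# Hypothetical Chl-a values in mg/m^3 (invented for illustration only):
# mostly low concentrations with two peaks that the model underestimates.
observed = [3.0, 5.0, 4.0, 6.0, 8.0, 7.0, 25.0, 40.0]
predicted = [3.5, 4.5, 4.5, 6.5, 7.0, 7.5, 18.0, 28.0]

scores = score_by_range(observed, predicted)
for name, value in scores.items():
    print(name, value)
```

With data this imbalanced, the overall R² can look acceptable while the RMSE on the few high-concentration samples is much larger than on the low ones, which is the effect the abstract attributes to data imbalance.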

Details

Language :
English
ISSN :
1614-7499
Volume :
31
Issue :
48
Database :
MEDLINE
Journal :
Environmental science and pollution research international
Publication Type :
Academic Journal
Accession number :
39316212
Full Text :
https://doi.org/10.1007/s11356-024-35005-y