Machine learning for modeling N₂O emissions from wastewater treatment plants: Aligning model performance, complexity, and interpretability.

Authors :
Khalil M
AlSayed A
Liu Y
Vanrolleghem PA
Source :
Water research [Water Res] 2023 Oct 15; Vol. 245, pp. 120667. Date of Electronic Publication: 2023 Sep 24.
Publication Year :
2023

Abstract

Nitrous oxide (N₂O) emissions may account for up to 80 % of a wastewater treatment plant's (WWTP) total carbon footprint. Given the complexity of the pathways involved, estimating N₂O emissions through mechanistic models still often fails to precisely depict process dynamics. Alternatively, data-driven methods for predicting N₂O emissions hold substantial potential. However, so far, a comprehensive approach is still overlooked, impeding the advancement of full-scale application. Therefore, this study develops a comprehensive approach for using machine learning to perform online process modeling of N₂O emissions. The approach is tested on a long-term N₂O emission dataset from a full-scale WWTP. Uniquely, the proposed approach emphasizes not just model accuracy, but it also considers model complexity, computational speed, and interpretability, equipping operators with the insights needed for informed corrective actions. Algorithms with varying levels of complexity and interpretability including k-Nearest Neighbors (kNN), decision trees, ensemble learning models, and deep neural networks (DNN) were considered. Furthermore, a parametric multivariate outlier removal method was adjusted to account for data statistical distributions, significantly reducing data loss. By employing an effective feature selection methodology, a trade-off between data acquisition, model performance, and complexity was found, reducing the number of features by 40 % and decreasing data collection cost, model complexity and computational burden without significant effect on modeling accuracy. The best performing models are kNN (R² = 0.88), AdaBoost (R² = 0.94), and DNN (R² = 0.90). Feature importance of models was analyzed and compared with process knowledge to test interpretability, guiding N₂O mitigation decisions.

Competing Interests: Declaration of Competing Interest. The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

(Copyright © 2023. Published by Elsevier Ltd.)
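
Illustrative note (not part of the published record): the abstract reports model quality as R² for regressors such as kNN. As a rough sketch of what that metric measures, the snippet below fits a from-scratch kNN regressor and scores it with R² on held-out data. Everything in it is an assumption made for illustration: the data are synthetic and unrelated to the study's WWTP dataset, and the helper names `knn_predict` and `r2_score`, the choice of k = 5, and the 80/20 split are hypothetical, not the authors' method.

```python
import math
import random


def knn_predict(train_X, train_y, x, k=3):
    """Predict by averaging the targets of the k nearest training points (Euclidean)."""
    nearest = sorted(zip(train_X, train_y), key=lambda pair: math.dist(pair[0], x))[:k]
    return sum(target for _, target in nearest) / len(nearest)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Synthetic stand-in data (an assumption): two made-up input features and a
# noisy nonlinear target. These are NOT measurements from the study.
rng = random.Random(0)
X = [(rng.uniform(0, 1), rng.uniform(0, 1)) for _ in range(300)]
y = [2.0 * a + math.sin(3.0 * b) + rng.gauss(0, 0.05) for a, b in X]

# Simple hold-out split: first 80 % for training, last 20 % for testing.
train_X, test_X = X[:240], X[240:]
train_y, test_y = y[:240], y[240:]

predictions = [knn_predict(train_X, train_y, x, k=5) for x in test_X]
score = r2_score(test_y, predictions)
print(f"kNN hold-out R^2 = {score:.2f}")
```

An R² of 1 means the predictions match the held-out targets exactly, while 0 means the model does no better than predicting the mean; the values quoted in the abstract (0.88 to 0.94) sit on this same scale.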

Details

Language :
English
ISSN :
1879-2448
Volume :
245
Database :
MEDLINE
Journal :
Water research
Publication Type :
Academic Journal
Accession number :
37778084
Full Text :
https://doi.org/10.1016/j.watres.2023.120667