1. Machine Learning Models for Evaluating Biological Reactivity Within Molecular Fingerprints of Dissolved Organic Matter Over Time.
- Author
-
Zhao, Chen, Wang, Kai, Jiao, Qianji, Xu, Xinyue, Yi, Yuanbi, Li, Penghui, Merder, Julian, and He, Ding
- Subjects
-
*MACHINE learning, *DNA fingerprinting, *DISSOLVED organic matter, *BODIES of water, *BIOLOGICAL models, *RF values (Chromatography), *INLAND navigation
- Abstract
Reservoirs exert a profound influence on the cycling of dissolved organic matter (DOM) in inland waters by altering flow regimes. Biological incubations can help to disentangle the role that microbial processing plays in the DOM cycling within reservoirs. However, the complex DOM composition poses a great challenge to the analysis of such data. Here we tested if the interpretable machine learning (ML) methodologies can contribute to capturing the relationships between molecular reactivity and composition. We developed time‐specific ML models based on 7‐day and 30‐day incubations to simulate the biogeochemical processes in the Three Gorges Reservoir over shorter and longer water retention periods, respectively. Results showed that the extended water retention time likely allows the successive microbial degradation of molecules, with stochasticity exerting a non‐negligible effect on the molecular composition at the initial stage of the incubation. This study highlights the potential of ML in enhancing our interpretation of DOM dynamics over time.

Plain Language Summary: As a comprehensive man‐made infrastructure, reservoirs significantly influence the chemical composition, reactivity, and turnover time of dissolved organic matter (DOM) within inland waters. However, it remains elusive how DOM molecules respond to microbial processing over different time scales. Besides the well‐recognized predictive power of machine learning (ML) methodologies, we delved into the processes of tuning the ML models to acquire additional interpretability. We used an under‐sampling strategy to improve model performance and simultaneously observed the variations in model performance metrics for different biological reactivity pools over incubations with different durations. We find that shorter incubation periods result in a broader range of molecules disappearing, with a greater contribution of stochasticity, while the longer incubation allows the successive biodegradation of oxygen‐poor compounds, with a greater contribution of directed degradation. As a complement to traditional geochemical methods, we unveiled a novel perspective in understanding the DOM dynamics over time using ML.

Key Points:
Machine learning (ML) models were built to correlate the molecular composition and biological reactivity at the world's largest reservoir.
Shorter incubations result in a broader range of molecules disappearing, with a greater contribution of stochasticity.
Tuning the ML model contributes to yield additional interpretability beyond its well‐recognized predictive power.

[ABSTRACT FROM AUTHOR]
- Published
- 2024