1. Estimating nitrogen and phosphorus concentrations in streams and rivers, within a machine learning framework
- Author
-
Longzhu Shen, Sami Domisch [0000-0002-8127-9335], Giuseppe Amatulli [0000-0002-8341-2830], Tushar Sethi, and Peter A. Raymond (Apollo - University of Cambridge Repository)
- Subjects
Statistics and Probability; Data descriptor; Pollution; Meteorology & atmospheric sciences; Streams; Environmental sciences; Library and Information Sciences; Machine learning; Natural sciences; Education; Nutrient; Element cycles; Earth and related environmental sciences; Phosphorus; Ecology; Life Below Water; Nitrogen; Computer Science Applications; Random forest; Artificial intelligence; Statistics, Probability and Uncertainty; Hydrology; Eutrophication; Biological Sciences; Information Systems
- Abstract
Funders: University of Cambridge, Department of Zoology; NASA (NNX17AI74G).
Nitrogen (N) and phosphorus (P) are essential nutrients for life processes in water bodies. In excessive quantities, however, they can be a significant source of aquatic pollution. Eutrophication has become a widespread issue arising from chemical nutrient imbalance and is largely attributed to anthropogenic activities. In view of this phenomenon, we present a new geo-dataset to estimate and map the concentrations of N and P in their various chemical forms at a spatial resolution of 30 arc-seconds (∼1 km) for the conterminous US. The models were built using Random Forest (RF), a machine learning algorithm, which regressed the seasonally measured N and P concentrations collected at 62,495 stations across US streams for the period 1994–2018 onto a set of 47 in-house built environmental variables that are available at a near-global extent. The seasonal models were validated through internal and external validation procedures, and their predictive power, measured by Pearson correlation coefficients, reached approximately 0.66 on average.
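As a rough illustration of the validation metric named in the abstract, the sketch below computes a Pearson correlation coefficient between observed and predicted concentrations for each season and then averages the seasonal values. It is a minimal sketch under stated assumptions, not the authors' validation procedure: the function name `pearson_r`, the `held_out` structure, the season labels, and all concentration values are hypothetical and were made up for this example.

```python
from math import sqrt


def pearson_r(observed, predicted):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    # Covariance and variances (unnormalized; the 1/n factors cancel in the ratio).
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_o == 0 or var_p == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_o * var_p)


# Hypothetical held-out (observed, predicted) concentrations per season, in mg/L.
# These values are invented for illustration and do not come from the dataset.
held_out = {
    "spring": ([0.8, 1.2, 2.5, 0.4, 1.9], [0.9, 1.0, 2.1, 0.7, 1.5]),
    "summer": ([0.5, 1.6, 2.2, 0.3, 1.1], [0.8, 1.2, 1.7, 0.6, 1.3]),
    "autumn": ([0.7, 1.4, 2.0, 0.6, 1.0], [1.0, 1.1, 1.8, 0.5, 1.2]),
    "winter": ([1.0, 1.8, 2.9, 0.9, 1.5], [1.3, 1.5, 2.2, 1.1, 1.4]),
}

# One coefficient per seasonal model, then the mean across seasons.
seasonal_r = {
    season: pearson_r(obs, pred) for season, (obs, pred) in held_out.items()
}
mean_r = sum(seasonal_r.values()) / len(seasonal_r)

for season, r in seasonal_r.items():
    print(f"{season}: r = {r:.2f}")
print(f"mean r = {mean_r:.2f}")
```

Averaging per-season coefficients, rather than pooling all seasons into one correlation, is one plausible reading of "on average" in the abstract; the record itself does not specify how the average was taken.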
- Published
- 2020