1. Uncertainty-guided sampling to improve digital soil maps
- Author
Sarah Schönbrodt-Stitt, Karsten Schmidt, Wei Xiang, Thomas Scholten, Alexandre M. J. C. Wadoux, Philipp Goebes, Felix Stumpf, and Thorsten Behrens
- Subjects
010504 meteorology & atmospheric sciences; Soil test; Calibration (statistics); Spatial uncertainty; Soil sampling; Sample (statistics); Soil landscape modeling; 01 natural sciences; Statistics; Sampling design; 0105 earth and related environmental sciences; Earth-Surface Processes; Mathematics; Soil map; Random Forest; Three Gorges Reservoir Area; Soil prediction improvement; Sampling (statistics); 04 agricultural and veterinary sciences; Random forest; Bodemgeografie en Landschap; Digital soil mapping; Soil Geography and Landscape; 040103 agronomy & agriculture; 0401 agriculture, forestry, and fisheries
- Abstract
Highlights
• Method and application to improve digital soil maps of silt and clay in China.
• Within the framework of a DSM approach, we derived spatial uncertainties.
• Spatial uncertainty is based on randomized decision trees.
• The model calibration set is refined by purposive sampling in areas of high uncertainty.
• Method and map refinement are validated using accuracy and uncertainty measures.

Digital soil mapping (DSM) products represent estimates of spatially distributed soil properties. These estimates carry uncertainty that is not evenly distributed over the area covered by DSM. If the uncertainty is quantified in a spatially explicit way, this information can be used to improve the quality of DSM by optimizing the sampling design. This study follows a DSM approach using a Random Forest regression model, legacy soil samples, and terrain covariates to estimate topsoil silt and clay contents in a small catchment of 4.2 km² in the Three Gorges Reservoir Area, Central China. We aim (i) to introduce a method to derive spatial uncertainty, and (ii) to improve the initial DSM approaches by additional sampling that is guided by the spatial uncertainty. The proposed uncertainty measure is based on multiple realizations of individual and randomized decision tree models. We used the spatial uncertainty of the initial DSM approaches to stratify the study area and thereby to identify potential sampling areas of high uncertainty. Further, we tested how well the available legacy samples cover the variability of the covariates within each potential sampling area, in order to define the final sampling area and to apply a purposive sampling design. For the final Random Forest model calibration, we combined the legacy sample set with the additional samples. This uncertainty-driven DSM refinement was evaluated by comparing it to a second approach. In this second approach, the additional samples were replaced by a random sample set of the same size, obtained from the entire study area. For the comparative analysis, external validation, bootstrap validation, and cross-validation were applied. The DSM approach using the uncertainty-driven refinement performed best. The averaged spatial uncertainty was reduced by 31% for silt and by 27% for clay compared to the initial DSM approach. Using external validation, the accuracy increased by the same proportions, with an overall accuracy of R² = 0.59 for silt and R² = 0.56 for clay.
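The idea of deriving spatial uncertainty from individual decision trees and then stratifying the area by it can be illustrated with a minimal sketch. The abstract does not state the exact uncertainty measure, so this example assumes the population standard deviation of the per-tree predictions at each grid cell. The function names, the quantile threshold, and the numbers are hypothetical and are not taken from the study.

```python
import statistics


def per_cell_uncertainty(tree_predictions):
    """Spread of the individual tree predictions at each grid cell.

    tree_predictions[t][c] is the prediction of tree t at cell c.
    Assumption: uncertainty = population standard deviation across trees.
    """
    n_cells = len(tree_predictions[0])
    return [
        statistics.pstdev([tree[c] for tree in tree_predictions])
        for c in range(n_cells)
    ]


def high_uncertainty_cells(uncertainty, quantile=0.9):
    """Indices of cells at or above a nearest-rank quantile threshold.

    The 0.9 default is an illustrative choice, not a value from the study.
    """
    ranked = sorted(uncertainty)
    threshold = ranked[min(len(ranked) - 1, int(quantile * len(ranked)))]
    return [i for i, u in enumerate(uncertainty) if u >= threshold]


# Made-up silt predictions (%) from three trees at four cells.
example_trees = [
    [30.0, 40.0, 35.0, 20.0],
    [30.0, 42.0, 35.0, 50.0],
    [30.0, 44.0, 35.0, 80.0],
]
example_uncertainty = per_cell_uncertainty(example_trees)
# Cells where the trees disagree most are candidates for additional sampling.
candidate_cells = high_uncertainty_cells(example_uncertainty, quantile=0.75)
```

With a fitted scikit-learn `RandomForestRegressor`, the per-tree predictions could plausibly be collected by calling `predict` on each member of its `estimators_` list. Whether that matches the authors' implementation is not stated in the abstract.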
- Published
- 2017