1. Multivariate random forest for digital soil mapping.
- Author
- van der Westhuizen, Stephan; Heuvelink, Gerard B.M.; and Hofmeyr, David P.
- Subjects
- *DIGITAL soil mapping, *RANDOM forest algorithms, *FOREST soils, *STANDARD deviations, *SOIL mapping, *QUANTILE regression
- Abstract
In digital soil mapping (DSM), soil maps are usually produced in a univariate manner, that is, each soil map is produced independently and therefore, when multiple soil properties are mapped the underlying dependence structure between these soil properties is ignored. This may lead to inconsistent predictions and simulations. For example, soil organic carbon (SOC) and total nitrogen (TN) maps produced independently may show unrealistic carbon–nitrogen (C:N) ratios. In the last decade the production of soil maps with machine learning models has become increasingly popular as these models are able to capture complex non-linear relationships between soil properties and environmental covariates. However, producing soil maps with multivariate machine learning models is still lacking and requires much investigation in DSM. In this paper we present the combined modelling of multiple soil properties with a multivariate random forest (MRF) model. We applied this model to mapping SOC and TN, and we compared it with results of two separate univariate random forest (RF) models. The comparison was done by means of stochastic simulations determined by sampling from the conditional distributions of the soil properties, given the covariates, as estimated by quantile regression forest. The results show that the MRF model is superior in terms of maintaining the dependence structure between SOC and TN, and consequently, is also able to produce more realistic C:N ratios. The models were also compared on the basis of prediction accuracy using commonly used accuracy metrics such as the root mean square error (RMSE). We found that the accuracy of the MRF model (RMSE-SOC = 40.04, RMSE-TN = 2.26, RMSE-CN = 3.58) is comparable to that of the univariate RF models (RMSE-SOC = 39.76, RMSE-TN = 2.26, RMSE-CN = 3.65). We performed the same comparisons between a regression co-kriging model and two separate regression kriging models, and made similar conclusions.
• Multivariate RF accounts for the correlation structure between soil properties.
• Simulations produced with multivariate RF are superior compared to those of RF.
• In digital soil assessment it is vital to account for correlations in soil properties.
• Multivariate RF C:N ratio simulations outperformed those of regression co-kriging.
• Prediction accuracy of multivariate RF is comparable with that of RF.
[ABSTRACT FROM AUTHOR]
- Published
- 2023
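Note: the abstract's central point, that simulating SOC and TN independently can yield unrealistic C:N ratios, can be illustrated with a small toy sketch. This is not the authors' MRF or quantile regression forest procedure. It only resamples synthetic data, and every name and number in it (the lognormal SOC distribution, the C:N ratio centred on 12, the helper `central_range`) is an assumption made for illustration.

```python
import random

random.seed(0)
n = 5000

# Synthetic "observations" (assumed, not from the paper): SOC is lognormal,
# and TN is tied to SOC through a C:N ratio that varies around 12.
soc = [random.lognormvariate(3.0, 0.5) for _ in range(n)]
cn_true = [random.gauss(12.0, 1.0) for _ in range(n)]
tn = [s / r for s, r in zip(soc, cn_true)]

# Joint simulation: SOC and TN are drawn together as a pair,
# so the dependence between them is preserved.
joint_idx = [random.randrange(n) for _ in range(n)]
cn_joint = [soc[i] / tn[i] for i in joint_idx]

# Independent simulation: SOC and TN are drawn separately,
# so the dependence between them is ignored.
cn_indep = [soc[random.randrange(n)] / tn[random.randrange(n)]
            for _ in range(n)]


def central_range(values, lo=0.025, hi=0.975):
    """Width of the central 95% interval, using simple order statistics."""
    ordered = sorted(values)
    last = len(ordered) - 1
    return ordered[int(hi * last)] - ordered[int(lo * last)]


joint_range = central_range(cn_joint)
indep_range = central_range(cn_indep)

print(f"95% range of C:N, joint draws:       {joint_range:.1f}")
print(f"95% range of C:N, independent draws: {indep_range:.1f}")
```

Under these assumptions the independently drawn pairs give a much wider spread of C:N ratios than the jointly drawn pairs, even though the marginal distributions of SOC and TN are the same in both cases.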