Yue Yin, Bingbo Gao, Hao Xu, Yuxue Wang, Dongkai Xie, Yanqing Liu, Chenyi Wang
Abstract
Intricate topography and weak spatial autocorrelation in mountainous areas give rise to strong local and directional heterogeneity in the spatial distribution of soil organic matter (SOM), and the relationships between SOM and auxiliary variables also vary across space. This mixed heterogeneity seriously reduces the accuracy of SOM spatial prediction. Furthermore, because sampling in mountainous areas is costly and difficult, soil samples are few, sparse, and unevenly distributed, which makes precise spatial prediction even harder. The recently developed two-point machine learning (TPML) method handles local heterogeneity and heterogeneous relationships through a two-step modeling approach, but its ability to address directional heterogeneity remains unexplored. This study investigates whether explicitly including the direction between two points as an auxiliary variable in TPML modeling improves the prediction accuracy of SOM in complex terrain with small sample sizes. Multiple sets of comparative experiments were conducted to assess the accuracy of several methods, including TPML, ordinary kriging, random forest, and random forest regression kriging. The results indicate that (1) TPML can capture the local and directional heterogeneity of SOM in mountainous areas and account for the spatially varying relationship between SOM and auxiliary variables; (2) TPML can characterize the directional heterogeneity of SOM even without directional information as an auxiliary variable; and (3) in cross-validation, TPML is the most accurate of the methods compared. The resulting maps show that TPML produces precise and coherent spatial distributions of SOM with fine spatial detail.
Keywords
Soil organic matter; Mountainous areas; Complex terrain; Limited sample size; Spatial distribution prediction; Two-point machine learning
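To make the idea of "directional information between two points" concrete, the sketch below shows one plausible way to build pairwise features from sample locations. It is an illustration under stated assumptions, not the authors' implementation: the helper names `two_point_features` and `pair_table`, the field names (`xy`, `elev`, `som`), the use of azimuth measured clockwise from north, and the toy sample values are all hypothetical and do not come from the study.

```python
import math


def two_point_features(p, q):
    """Distance and azimuth from point p to point q (hypothetical helper).

    Coordinates are assumed to be projected (easting, northing) pairs.
    Azimuth is in degrees, clockwise from north, in the range [0, 360).
    """
    dx = q[0] - p[0]  # easting difference
    dy = q[1] - p[1]  # northing difference
    distance = math.hypot(dx, dy)
    # atan2(dx, dy) measures the angle from the north axis toward east.
    azimuth = math.degrees(math.atan2(dx, dy)) % 360.0
    return distance, azimuth


def pair_table(samples):
    """Build one row per ordered pair of samples.

    Each row holds the distance, the azimuth, the difference in one
    covariate (elevation, as an assumed example), and the difference in
    the target value. A pairwise learner could be trained on such rows.
    """
    rows = []
    for i, a in enumerate(samples):
        for j, b in enumerate(samples):
            if i == j:
                continue  # skip self-pairs
            distance, azimuth = two_point_features(a["xy"], b["xy"])
            rows.append({
                "i": i,
                "j": j,
                "distance": distance,
                "azimuth": azimuth,
                "d_elev": b["elev"] - a["elev"],
                "d_som": b["som"] - a["som"],
            })
    return rows


# Toy values for illustration only; they are not data from the study.
samples = [
    {"xy": (0.0, 0.0), "elev": 500.0, "som": 20.0},
    {"xy": (1.0, 0.0), "elev": 520.0, "som": 22.0},
    {"xy": (0.0, 1.0), "elev": 480.0, "som": 19.0},
]
rows = pair_table(samples)
```

With n samples this produces n(n-1) ordered pairs, so even a small sample set yields a much larger pairwise training table. Whether an explicit `azimuth` column actually improves accuracy is the question the study examines.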