Soil organic matter (SOM) is critical for soil fertility and crop growth and plays an important role in the global carbon cycle and climate change. Spatial prediction of SOM is therefore important for rational soil resource utilization, agricultural production, and ecological environment management. However, large-area SOM mapping relies heavily on legacy soil data, and mapping recent SOM at large scales may be infeasible or less accurate because recent data are limited. In this study, we aimed to improve SOM prediction and mapping accuracy by combining legacy data with limited recent data. Three models, namely partial least squares regression (PLSR), random forest (RF), and a one-dimensional convolutional neural network (1D-CNN), were applied and compared. The results showed that combining legacy and recent data improved SOM prediction accuracy compared with using recent data alone. Among the three models, 1D-CNN performed best, with an average coefficient of determination of prediction (R²) of 0.58, a root mean square error (RMSE) of 4.56 g/kg, and a ratio of performance to interquartile distance (RPIQ) of 2.05. The predicted SOM content for the legacy (1980s) and recent (2010s) periods showed similar spatial distribution patterns across the Huanghuaihai Plain. In general, SOM content increased from northwest to southeast, with higher values in Jiangsu and lower values concentrated in the Henan, Hebei, and Shandong portions of the study area. Over time, SOM content in the Huanghuaihai Plain increased, by an average of 5.90 g/kg from the legacy to the recent period. This study provides a promising approach for improving SOM prediction and mapping accuracy at large scales, particularly when recent data are limited.
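For readers less familiar with the three accuracy metrics reported above, the following is a minimal sketch of how R², RMSE, and RPIQ are commonly computed from paired observed and predicted SOM values. It is not the authors' code. The function name `prediction_metrics` is hypothetical, and the sketch assumes that R² is defined as 1 − SS_res/SS_tot (some studies instead report the squared correlation) and that RPIQ is the interquartile range of the observed values divided by the RMSE.

```python
import numpy as np


def prediction_metrics(observed, predicted):
    """Return R2, RMSE, and RPIQ for paired observed/predicted values.

    Hypothetical helper for illustration; definitions are assumptions
    stated in the accompanying text, not taken from the study itself.
    """
    obs = np.asarray(observed, dtype=float)
    pred = np.asarray(predicted, dtype=float)
    if obs.shape != pred.shape or obs.size == 0:
        raise ValueError("observed and predicted must be non-empty and the same shape")

    residuals = obs - pred

    # RMSE: square root of the mean squared residual (same units as SOM, e.g. g/kg).
    rmse = float(np.sqrt(np.mean(residuals ** 2)))

    # R2 as 1 - SS_res / SS_tot (assumed definition).
    ss_res = float(np.sum(residuals ** 2))
    ss_tot = float(np.sum((obs - obs.mean()) ** 2))
    r2 = 1.0 - ss_res / ss_tot

    # RPIQ: interquartile range (Q3 - Q1) of the observed values divided by RMSE.
    q1, q3 = np.percentile(obs, [25, 75])
    rpiq = float((q3 - q1) / rmse)

    return {"r2": r2, "rmse": rmse, "rpiq": rpiq}


if __name__ == "__main__":
    # Made-up values for illustration only; not data from the study.
    observed_som = [10.0, 12.0, 14.0, 16.0, 18.0]
    predicted_som = [11.0, 12.0, 13.0, 17.0, 18.0]
    print(prediction_metrics(observed_som, predicted_som))
```

Because RPIQ scales the error by the interquartile range rather than the standard deviation, it makes no assumption about the distribution of the observed values, which suits the often skewed distributions of soil properties.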