Abstract

Soil organic carbon (SOC), as the largest terrestrial carbon pool, has the potential to influence climate change and its mitigation, and consequently SOC monitoring is important in the frameworks of different international treaties. There is therefore a need for high resolution SOC maps. Machine learning (ML) offers new opportunities to produce such maps because of its capability for data mining of large datasets. The aim of this study, therefore, was to test three commonly used algorithms in digital soil mapping – random forest (RF), boosted regression trees (BRT) and support vector machine for regression (SVR) – on the first German Agricultural Soil Inventory to model agricultural topsoil SOC content. Nested cross-validation was implemented for model evaluation and parameter tuning. Moreover, grid search and the differential evolution algorithm were applied to ensure that each algorithm was suitably tuned and optimised. The SOC content of the German Agricultural Soil Inventory was highly variable, ranging from 4 g kg−1 to 480 g kg−1. However, only 4 % of all soils contained more than 87 g kg−1 SOC and were considered organic or degraded organic soils. The results show that SVR provided the best performance, with an RMSE of 32 g kg−1, when the algorithms were trained on the full dataset. However, the average RMSE of all algorithms decreased by 34 % when mineral and organic soils were modelled separately, with the best result from SVR with an RMSE of 21 g kg−1. Model performance is often limited by the size and quality of the soil dataset available for calibration and validation. Therefore, the impact of enlarging the training data was tested by including 1223 data points from the European Land Use/Land Cover Area Frame Survey for agricultural sites in Germany. Model performance improved by at most 1 % for mineral soils and 2 % for organic soils. Despite the capability of machine learning algorithms in general, and particularly SVR, in modelling SOC on a national scale, the study showed that the most important step for improving model performance was modelling mineral and organic soils separately.
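
The nested cross-validation mentioned in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the study's implementation: a simple k-nearest-neighbour regressor stands in for RF, BRT and SVR, the data are synthetic, and the function names (`nested_cv`, `k_folds`, `knn_predict`) and fold counts are assumptions made for illustration. The point it shows is structural: the inner loop picks the hyperparameter by grid search using only the outer-training data, and the outer loop reports RMSE on data never used for tuning.

```python
import math
import random


def rmse(y_true, y_pred):
    """Root mean squared error between two equally long sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def knn_predict(train, x, k):
    """Mean target of the k training rows nearest to x (one feature only)."""
    nearest = sorted(train, key=lambda row: abs(row[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)


def k_folds(rows, n_folds):
    """Yield (train, test) pairs from an interleaved n-fold split."""
    folds = [rows[i::n_folds] for i in range(n_folds)]
    for i, test in enumerate(folds):
        train = [r for j, fold in enumerate(folds) if j != i for r in fold]
        yield train, test


def score(train, test, k):
    """RMSE of the stand-in model on one train/test split."""
    preds = [knn_predict(train, x, k) for x, _ in test]
    return rmse([y for _, y in test], preds)


def nested_cv(rows, grid, outer=5, inner=3):
    """Mean outer-fold RMSE, with k tuned by grid search in the inner folds."""
    outer_scores = []
    for train, test in k_folds(rows, outer):
        def inner_score(k):
            # Tuning sees only the outer-training rows, never the outer test fold.
            scores = [score(tr, va, k) for tr, va in k_folds(train, inner)]
            return sum(scores) / len(scores)

        best_k = min(grid, key=inner_score)
        outer_scores.append(score(train, test, best_k))
    return sum(outer_scores) / len(outer_scores)


# Synthetic (feature, target) rows; these are NOT soil inventory data.
rng = random.Random(0)
rows = []
for _ in range(120):
    x = rng.uniform(0, 10)
    rows.append((x, 10 + 3 * x + rng.gauss(0, 1)))

mean_rmse = nested_cv(rows, grid=[1, 3, 5, 9])
print(round(mean_rmse, 2))
```

Keeping tuning inside the inner folds is what prevents the reported outer RMSE from being optimistically biased by the parameter search.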

Highlights

  • Soil organic carbon (SOC) is the largest terrestrial carbon pool (Wang et al, 2020) and plays an essential role in agriculture

  • The results show that support vector machine for regression (SVR) provided the best performance with

  • In order to harmonise the depths of both datasets, these were subdivided into mineral and organic soil classes according to a SOC threshold value of 87.0 g kg−1, considering all soils above this threshold as organic soils, comprising peat soils and disturbed and degraded peat soils (Poeplau et al, 2020)
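
The threshold-based split described in the last highlight could be sketched as follows. This is a minimal illustration, not the authors' code: the record layout (a dict with a `soc` key in g kg−1), the sample values and the function name `split_by_soc` are assumptions. Only the 87.0 g kg−1 threshold and the rule that soils above it count as organic come from the text, so a sample exactly at the threshold is treated as mineral here.

```python
SOC_THRESHOLD_G_PER_KG = 87.0  # threshold stated in the text


def split_by_soc(samples, threshold=SOC_THRESHOLD_G_PER_KG):
    """Split samples into (mineral, organic) lists by SOC content.

    Soils strictly above the threshold are classed as organic,
    everything else as mineral.
    """
    mineral = [s for s in samples if s["soc"] <= threshold]
    organic = [s for s in samples if s["soc"] > threshold]
    return mineral, organic


# Hypothetical records for illustration only.
samples = [
    {"id": "a", "soc": 4.0},
    {"id": "b", "soc": 87.0},
    {"id": "c", "soc": 120.5},
    {"id": "d", "soc": 480.0},
]

mineral, organic = split_by_soc(samples)
print([s["id"] for s in mineral], [s["id"] for s in organic])
```

Each subset can then be passed to its own model, which is the separate-modelling setup the abstract reports as most beneficial.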


Summary

Introduction

Soil organic carbon (SOC) is the largest terrestrial carbon pool (Wang et al, 2020) and plays an essential role in agriculture. Its decline is identified as a threat that leads to soil degradation (Castaldi et al, 2019; Poeplau et al, 2020). Through carbon sequestration, the SOC pool offers an option for climate change mitigation (Meersmans et al, 2012; Ward et al, 2019). SOC monitoring is important in the frameworks of various international treaties such as the European Union Soil Thematic Strategy and the United Nations Framework Convention on Climate Change (Meersmans et al, 2012; Poeplau et al, 2020).
