Soil organic carbon (SOC) is the main terrestrial carbon (C) fraction and has a considerable effect on climate change and greenhouse gas emissions through the absorption and sequestration of carbon dioxide (CO2). SOC assessment is therefore important from both economic and environmental viewpoints. The growing number of soil spectral libraries (SSLs), from regional to global scales, offers a major opportunity to quantify SOC by developing spectral-based prediction models, and it calls for big data analytics in spectral data processing. Deep learning (DL) techniques can extract the important features of high-dimensional, large-scale SSLs, which has made them highly sought after for more sophisticated modeling. The core objective of the present study was to assess the ability of two DL algorithms, a one-dimensional convolutional neural network (1DCNN) and a fully connected neural network (FCNN), each coupled with stacked autoencoder (SAE) feature extraction, to predict SOC from the land use/cover area frame statistical survey (LUCAS) database. The SAE extracted high-level deep features from the visible–near-infrared–shortwave infrared (Vis–NIR–SWIR) spectra of 11441 soil samples, and these features were then used as inputs to the 1DCNN and FCNN models for predicting SOC content. Both SAE-coupled DL models yielded higher accuracy than the corresponding DL models developed on the entire spectra; a random forest (RF) model was also constructed for comparison. The best prediction was achieved by SAE-1DCNN (R² = 0.78, RMSE = 3.94%, RPD = 4.88, RPIQ = 3.91), followed by 1DCNN (R² = 0.73, RMSE = 5.43%, RPD = 3.67, RPIQ = 2.84), indicating that 1DCNN outperformed FCNN in this study. These results support the applicability of combining deep feature extraction with regression methods for predicting SOC from high-dimensional, large-scale SSLs.
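The four accuracy measures reported above (R2, RMSE, RPD, RPIQ) can be computed from paired observed and predicted SOC values. The abstract does not state the formulas used, so the sketch below assumes the conventional definitions: RMSE as the root mean squared error, RPD as the standard deviation of the observed values divided by RMSE, and RPIQ as the interquartile range of the observed values divided by RMSE. The function name `regression_metrics`, the use of the sample standard deviation, and the inclusive quartile method are illustrative choices, not details taken from the study.

```python
import math
import statistics


def regression_metrics(observed, predicted):
    """Return R2, RMSE, RPD and RPIQ for paired observed/predicted values.

    Assumed definitions (not confirmed by the source):
      R2   = 1 - SS_res / SS_tot
      RMSE = sqrt(mean squared residual), in the units of the inputs
      RPD  = sample standard deviation of observed / RMSE
      RPIQ = (Q3 - Q1) of observed / RMSE
    """
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")

    n = len(observed)
    mean_obs = statistics.fmean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values are constant; R2 is undefined")

    rmse = math.sqrt(ss_res / n)
    sd_obs = statistics.stdev(observed)  # sample SD (n - 1 denominator)
    q1, _, q3 = statistics.quantiles(observed, n=4, method="inclusive")

    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": rmse,
        # A perfect fit has RMSE = 0, so the ratios are reported as infinity.
        "rpd": sd_obs / rmse if rmse > 0 else math.inf,
        "rpiq": (q3 - q1) / rmse if rmse > 0 else math.inf,
    }


if __name__ == "__main__":
    # Toy values for illustration only; these are not LUCAS data.
    obs = [1.0, 2.0, 3.0, 4.0, 5.0]
    pred = [1.1, 1.9, 3.2, 3.8, 5.0]
    for name, value in regression_metrics(obs, pred).items():
        print(f"{name}: {value:.3f}")
```

Because RPD and RPIQ both divide a spread measure of the observed data by RMSE, they depend on the distribution of SOC in the validation set as well as on model error, which is one reason they are usually reported alongside R2 and RMSE rather than in place of them.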