Abstract

No single instrument can characterize all soil properties because soil is a complex material. With the advancement of technology, laboratories have become equipped with various spectrometers. Fusing the output of different spectrometers is expected to yield better predictions than using any single spectrometer alone. In this study, we compared model performance using a single spectrometer (visible-near-infrared spectroscopy, vis-NIR, or mid-infrared spectroscopy, MIR) with that of the two spectrometers combined (vis-NIR and MIR). We selected a total of 14,594 samples from the Kellogg Soil Survey Laboratory (KSSL) database that had both vis-NIR and MIR spectra along with measurements of sand, clay, total C (TC) content, organic C (OC) content, cation exchange capacity (CEC), and pH. The dataset was randomly split into a training set (75%, n = 10,946) and a test set (the remaining 25%, n = 3,648). Prediction models were constructed with partial least squares regression (PLSR) and the Cubist regression tree model. Additionally, we explored the use of a deep learning model, the convolutional neural network (CNN). We investigated various ways to feed spectral data to the CNN, either as one-dimensional (1D) data (as a spectrum) or as two-dimensional (2D) data (as a spectrogram). Compared to the PLSR model, we found that the CNN model improved prediction accuracy by an average of 33–42% with vis-NIR and 30–43% with MIR spectral data input. The relative accuracy improvement of the CNN over the Cubist regression tree model was 22–36% with vis-NIR and 16–27% with MIR spectral data input. Various methods to fuse the vis-NIR and MIR spectral data were explored. We compared the performance of spectral concatenation (for the PLSR and Cubist models) with that of the two-channel input method and the outer product analysis (OPA) method (the latter two for the CNN model).
We found that the two-channel 1D CNN model performed best (R² = 0.95–0.98), followed closely by OPA with the CNN (R² = 0.93–0.98), the Cubist model with spectral concatenation (R² = 0.91–0.97), the two-channel 2D CNN model (R² = 0.90–0.95), and PLSR with spectral concatenation (R² = 0.87–0.95). Chemometric analysis of spectroscopic data conventionally relies on spectral pre-processing, such as spectral trimming, baseline correction, smoothing, and normalization, before the data are fed into a model. The CNN achieved higher performance than the PLSR and Cubist models without using pre-processed spectral data. We also found that the predictions using the CNN model retained similar correlations to the actual values in comparison to other models. Using sensitivity analysis, we identified the important spectral wavelength variables used by the CNN model to predict various soil properties. The CNN is an effective model for predicting soil properties from a large spectral library.
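The three fusion strategies named above differ mainly in the shape of the array handed to the model. The following is a minimal NumPy sketch of those shapes only, not the authors' implementation: the function names, the toy array sizes, and the assumption that both spectra are resampled to a common length for the two-channel input are illustrative choices that the abstract does not specify.

```python
import numpy as np


def concatenate_spectra(visnir, mir):
    """Spectral concatenation: (n, p) and (n, q) -> (n, p + q)."""
    return np.concatenate([visnir, mir], axis=1)


def two_channel(visnir, mir):
    """Two-channel input: (n, p) and (n, p) -> (n, p, 2).

    Assumption: both spectra were resampled to the same length p.
    """
    if visnir.shape != mir.shape:
        raise ValueError("two-channel input needs spectra of equal shape")
    return np.stack([visnir, mir], axis=-1)


def outer_product(visnir, mir):
    """Outer product of each sample's two spectra: (n, p) and (n, q) -> (n, p, q)."""
    return np.einsum("ip,iq->ipq", visnir, mir)


# Toy data: 4 samples, 6 spectral variables per instrument (hypothetical sizes).
rng = np.random.default_rng(0)
visnir = rng.random((4, 6))
mir = rng.random((4, 6))

fused_concat = concatenate_spectra(visnir, mir)
fused_channels = two_channel(visnir, mir)
fused_opa = outer_product(visnir, mir)

print(fused_concat.shape, fused_channels.shape, fused_opa.shape)
# (4, 12) (4, 6, 2) (4, 6, 6)
```

In this sketch, concatenation keeps one long vector per sample, which suits PLSR and Cubist; the two-channel array can be passed to a 1D convolution with two input channels; and the outer product yields one matrix per sample holding every pairwise product of a vis-NIR and an MIR variable.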
