Abstract
This study investigated the performance of pre-processing, feature selection and machine-learning methods for the spectroscopic diagnosis of soil arsenic contamination. The spectral data were pre-processed using Savitzky-Golay smoothing, first and second derivatives, multiplicative scatter correction, standard normal variate, and mean centering. Principal component analysis (PCA) and the RELIEF algorithm were used to extract spectral features. Machine-learning methods, including random forests (RF), artificial neural networks (ANN), and radial basis function- and linear function-based support vector machines (RBF- and LF-SVM), were employed to establish diagnosis models. Model accuracies were evaluated and compared using overall accuracies (OAs). The statistical significance of the difference between models was evaluated using McNemar's test (Z value). The results showed that the OAs varied with the combination of pre-processing, feature selection, and classification methods. Feature selection methods could improve modeling efficiency and diagnosis accuracy, and RELIEF often outperformed PCA. The optimal models established by RF (OA = 86%), ANN (OA = 89%), RBF-SVM (OA = 89%) and LF-SVM (OA = 87%) showed no statistically significant difference in diagnosis accuracy (Z < 1.96, not significant at the 0.05 level). These results indicated that it was feasible to diagnose soil arsenic contamination using reflectance spectroscopy, and that an appropriate combination of multivariate methods was important for improving diagnosis accuracy.
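The comparison by McNemar's test can be illustrated with a minimal sketch. It assumes the uncorrected form of the statistic commonly used for comparing classifiers, Z = (a - b) / sqrt(a + b), where a and b count the samples that only one of the two models classifies correctly. The abstract does not state which variant the authors used, so the formula, the function names and the toy labels below are illustrative assumptions, not the study's implementation.

```python
import math


def mcnemar_z(y_true, pred_a, pred_b):
    """Z statistic of McNemar's test for two classifiers on the same samples.

    Assumed form (no continuity correction): Z = (a - b) / sqrt(a + b),
    where a = samples only model A gets right, b = samples only model B gets right.
    """
    a_only = sum(1 for t, a, b in zip(y_true, pred_a, pred_b) if a == t and b != t)
    b_only = sum(1 for t, a, b in zip(y_true, pred_a, pred_b) if b == t and a != t)
    if a_only + b_only == 0:
        return 0.0  # the two models never disagree in correctness
    return (a_only - b_only) / math.sqrt(a_only + b_only)


def two_sided_p(z):
    """Two-sided p-value of a standard normal Z statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# Hypothetical labels: 1 = contaminated, 0 = uncontaminated.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
pred_a = [1, 1, 1, 1, 0, 0, 0, 1]
pred_b = [1, 0, 0, 1, 0, 0, 1, 1]

z = mcnemar_z(y_true, pred_a, pred_b)
p = two_sided_p(z)
print(f"Z = {z:.3f}, p = {p:.3f}")
```

Under this form, |Z| < 1.96 corresponds to a two-sided p-value above 0.05, which is why models with Z < 1.96 are read as not differing significantly in accuracy.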
Highlights
Soil heavy metal contamination demands effective methods for diagnosing suspected contaminated areas and controlling the rehabilitation process
The results showed that a suitable combination of pre-processing and feature selection was vital for improving the overall accuracy (OA) of each machine-learning method
With the combination of pre-processing, feature selection and machine-learning methods, the OAs for soil arsenic contamination diagnosis reached a satisfactory level (OA > 85%). This result demonstrated that VNIRS could be applied to diagnose soil arsenic contamination. In developing the diagnosis models, VNIRS depended on conventional methods to provide the ground truth of soil heavy metal contamination.
Summary
Soil heavy metal contamination demands effective methods for diagnosing suspected contaminated areas and controlling the rehabilitation process. There is increasing interest in using visible and near-infrared reflectance spectroscopy (VNIRS, 350–2500 nm) to measure soil heavy metal contents and to map their spatial distribution [1], since this technique provides a non-destructive, rapid, and cost-effective way to measure several soil properties from a single scan, and requires minimal sample preparation and few hazardous chemicals [2]. The spectroscopic measurement of heavy metals is usually feasible because of their indirect relationships with spectrally active soil properties, such as organic matter, iron oxides or clays [1]. The spectral features of soil properties in visible/near-infrared spectra largely overlap, while other factors, such as surface roughness, moisture content, and organic matter of soil, weaken the spectroscopic measurement of soil properties [3]. The analysis of visible/near-infrared spectra therefore requires multivariate chemometric techniques to mathematically extract useful information for soil property estimations.
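As an illustration of how such pre-processing counters scatter-type disturbances, the following sketch applies two of the steps named in the abstract, standard normal variate (SNV) and mean centering, to toy spectra. It is a minimal pure-Python sketch: the function names and reflectance values are hypothetical, and using the sample standard deviation for SNV is an assumption, since the text does not specify the variant.

```python
from statistics import mean, stdev


def snv(spectrum):
    """Standard normal variate: centre and scale one spectrum by its own
    mean and (sample) standard deviation across wavelengths."""
    m = mean(spectrum)
    s = stdev(spectrum)
    if s == 0:
        raise ValueError("flat spectrum: standard deviation is zero")
    return [(v - m) / s for v in spectrum]


def mean_center(spectra):
    """Subtract the per-wavelength mean (column mean) from every spectrum."""
    col_means = [mean(col) for col in zip(*spectra)]
    return [[v - m for v, m in zip(row, col_means)] for row in spectra]


# Hypothetical reflectance values at four wavelengths.
base = [0.10, 0.20, 0.30, 0.40]
# The same sample with an additive offset and multiplicative scaling,
# mimicking a scatter effect such as surface roughness.
scattered = [2.0 * v + 0.05 for v in base]

snv_base = snv(base)
snv_scattered = snv(scattered)
centered = mean_center([base, scattered])
print(snv_base)
print(snv_scattered)
```

After SNV the two toy spectra coincide, because SNV removes any per-spectrum additive offset and positive multiplicative scaling. Mean centering then removes the average spectrum so that later steps such as PCA work on deviations from it.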