Abstract

An advanced estimation model, CARS-PLS (competitive adaptive reweighted sampling-partial least squares), has been developed for feature extraction and soil contamination analysis. However, this method works well for a single mining area but has limited capacity to cope with the variation across multiple sites. In this paper, we present an improved estimation model, CARS-PLS-SVM, which uses support vector machines (SVM) to handle the nonlinear problem that arises across multiple sites. We selected two study areas, located in metal mining and coal mining areas of north China, and collected a total of 65 soil samples. The heavy metal(loid) concentrations (Cr and As) were determined by inductively coupled plasma-mass spectrometry (ICP-MS) and atomic fluorescence spectrometry (AFS), and the visible and near-infrared spectra of the soil samples were measured with an ASD (Analytical Spectral Devices) field spectrometer (350–2500 nm). The samples were divided into a calibration set (n = 41) and a validation set (m = 24) according to heavy metal(loid) concentration. After pretreatment, the features extracted by CARS-PLS were used as the input to an SVM for further modeling. The performance of CARS-PLS-SVM was investigated under seven different pretreatment methods. The results showed that the combination of Savitzky-Golay and standard normal transformation pretreatment achieved the best accuracy for Cr estimation (coefficient of determination for prediction, Rp2 = 0.9705; root-mean-square error for prediction, RMSEP = 5.0253; residual prediction deviation, RPD = 5.9001; ratio of prediction performance to interquartile range, RPIQ = 11.4043) and for As prediction (Rp2 = 0.9483; RMSEP = 1.2024; RPD = 3.0689; RPIQ = 5.3643). In addition, CARS-PLS-SVM outperformed partial least squares (PLS) regression under the same pretreatment methods.
Compared with three current state-of-the-art models, namely wavelet transform PLS (WT-PLS), synergy interval PLS (siPLS), and CARS-PLS, CARS-PLS-SVM was found to be superior for Cr prediction (Rp2 = 0.9705; RMSEP = 5.0253; RPD = 5.9001; RPIQ = 11.4043) and As prediction (Rp2 = 0.9483; RMSEP = 1.2024; RPD = 3.0689; RPIQ = 5.3643). The results demonstrate that, compared with the linear models, the nonlinear CARS-PLS-SVM model achieves the highest precision in estimating soil heavy metal(loid) concentrations across multiple mining areas through appropriate spectral feature extraction from the pretreated spectra.
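
The four validation statistics reported above (Rp2, RMSEP, RPD, RPIQ) can be illustrated with a short sketch. The abstract does not give the exact formulas, so the definitions below are common chemometric conventions and should be read as assumptions: Rp2 is taken as 1 minus the ratio of residual to total sum of squares, RPD as the sample standard deviation of the reference values divided by RMSEP, and RPIQ as the interquartile range of the reference values divided by RMSEP. The function name `prediction_metrics` and the example values are hypothetical and are not data from the study.

```python
import math
import statistics


def prediction_metrics(y_true, y_pred):
    """Return Rp2, RMSEP, RPD and RPIQ for a validation set.

    Assumed definitions (not stated in the abstract):
      Rp2   = 1 - SS_res / SS_tot
      RMSEP = sqrt(mean squared prediction error)
      RPD   = sample SD of reference values / RMSEP
      RPIQ  = (Q3 - Q1) of reference values / RMSEP
    """
    if len(y_true) != len(y_pred) or len(y_true) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")

    n = len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    mean_true = statistics.fmean(y_true)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("reference values must not all be identical")

    rmsep = math.sqrt(ss_res / n)
    # Quartiles by linear interpolation over the observed range ("inclusive").
    q1, _, q3 = statistics.quantiles(y_true, n=4, method="inclusive")

    # A perfect prediction gives RMSEP = 0; report the ratios as infinite.
    rpd = statistics.stdev(y_true) / rmsep if rmsep > 0 else math.inf
    rpiq = (q3 - q1) / rmsep if rmsep > 0 else math.inf

    return {"rp2": 1 - ss_res / ss_tot, "rmsep": rmsep, "rpd": rpd, "rpiq": rpiq}


if __name__ == "__main__":
    # Hypothetical concentrations (mg/kg), for illustration only.
    measured = [10.0, 20.0, 30.0, 40.0, 50.0]
    predicted = [12.0, 18.0, 33.0, 39.0, 52.0]
    for name, value in prediction_metrics(measured, predicted).items():
        print(f"{name}: {value:.4f}")
```

Under these definitions, RPD and RPIQ both scale the spread of the reference values by the prediction error, so a larger value indicates a more useful model. RPIQ relies on quartiles rather than the standard deviation and is therefore less sensitive to skewed concentration distributions, which are common in contaminated soils.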
