The extensive accumulation of tailings can cause heavy metal contamination in the surrounding farmland soil. Accurately predicting the spatial distribution of heavy metals in farmland soil is crucial for assessing the potential environmental hazards of tailings. This study focuses on the spatial distribution and quantitative prediction of heavy metals (chromium (Cr), vanadium (V), and copper (Cu)) in soils surrounding mine tailings using hyperspectral data analysis and multiple prediction models. The original hyperspectral reflectance data were processed using first-order differential (FD), second-order differential (SD), reciprocal logarithmic (LR), and continuum removal (CR) transformations to highlight the positions of characteristic bands. Multiple linear regression (MLR), stepwise linear regression (SLR), partial least squares regression (PLSR), random forest (RF), and back-propagation artificial neural network (BP-ANN) models were used to establish inversion models for Cr, V, and Cu based on bands with high correlation coefficients. The performance of the inversion models was evaluated using the coefficient of determination (R²), root mean square error (RMSE), mean absolute error (MAE), and residual predictive deviation (RPD). The results indicate that the raw hyperspectral data from the measured soil exhibit a weak response to heavy metal content in the study area. However, applying the FD, SD, and CR transformations significantly enhances the sensitivity of the soil spectral data to heavy metal concentrations, facilitating subsequent modeling. Among these, the SD transformation is particularly beneficial for modeling Cr and Cu in the soil, whereas the FD transformation yields data that are more suitable for modeling V. Among the inversion models based on the measured spectral data, the BP-ANN model exhibited the best predictive performance. Specifically, when combined with SD spectral data, the BP-ANN model achieved the highest predictive accuracy for Cu content (R² = 0.85, RPD = 2.12). The RF model demonstrated the next best performance, with its optimal inversion model also using SD spectral data to predict Cu content (R² = 0.76, RPD = 1.90). In contrast, the MLR model exhibited the poorest performance and is unsuitable for predicting heavy metal content in the region from the measured spectral data. This study highlights the potential of spectral data in environmental monitoring and provides a technical reference for the inversion assessment and regulation of heavy metals in farmlands surrounding tailings sites.
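As an illustration of the spectral transformations and evaluation metrics named above, the following is a minimal Python sketch and not the authors' implementation. It assumes that FD and SD are numerical derivatives of reflectance with respect to wavelength, that LR is computed as log10(1/R), and that RPD is the sample standard deviation of the observed values divided by the RMSE. These definitions are common in soil spectroscopy but are not confirmed by the abstract. The function names are hypothetical, and continuum removal is omitted for brevity.

```python
import numpy as np


def first_derivative(reflectance, wavelengths):
    """FD: numerical first derivative of reflectance along the band axis."""
    return np.gradient(reflectance, wavelengths, axis=-1)


def second_derivative(reflectance, wavelengths):
    """SD: numerical second derivative, taken as the derivative of FD."""
    fd = first_derivative(reflectance, wavelengths)
    return np.gradient(fd, wavelengths, axis=-1)


def log_reciprocal(reflectance, eps=1e-12):
    """LR: log10(1/R). Assumed definition; eps guards against division by zero."""
    return np.log10(1.0 / np.clip(reflectance, eps, None))


def regression_metrics(y_true, y_pred):
    """Return R2, RMSE, MAE and RPD for measured vs. predicted contents."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residuals = y_true - y_pred
    rmse = np.sqrt(np.mean(residuals ** 2))
    mae = np.mean(np.abs(residuals))
    ss_res = np.sum(residuals ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    # Assumed RPD definition: sample standard deviation of observations / RMSE.
    rpd = np.std(y_true, ddof=1) / rmse
    return {"R2": r2, "RMSE": rmse, "MAE": mae, "RPD": rpd}


if __name__ == "__main__":
    # Synthetic stand-in data: 5 samples x 50 bands, not values from the study.
    rng = np.random.default_rng(0)
    wavelengths = np.linspace(400.0, 2400.0, 50)
    spectra = rng.uniform(0.1, 0.6, size=(5, 50))
    sd_spectra = second_derivative(spectra, wavelengths)
    print(sd_spectra.shape)
    print(regression_metrics([1.0, 2.0, 3.0, 4.0], [1.5, 2.5, 3.5, 4.5]))
```

In a workflow like the one described, the transformed spectra would be correlated band by band with the measured metal contents, and the most strongly correlated bands would be passed to the regression models before computing these metrics on held-out samples.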