Chromium (Cr) contamination of soil threatens human health and ecosystems, so effective monitoring and management depend on accurate quantitative prediction methods. This study develops predictive models of Cr content in farmland soils of Mojiang Hani Autonomous County, Pu’er City, Yunnan Province, China, using ensemble learning on visible and near-infrared (Vis-NIR) spectroscopy. Before modeling, the dataset was partitioned with the Kennard-Stone algorithm to obtain representative training and testing sets. The spectra were preprocessed with Savitzky-Golay smoothing and a first-order derivative transformation to enhance signal quality. Bands were selected with the Successive Projections Algorithm (SPA), which reduces data dimensionality and collinearity. Six ensemble learning models were constructed and assessed for predictive performance: Bagging-DTR, Random Forest (RF), AdaBoost-DTR, XGBoost-DTR, Stacking-1, and Stacking-2, with decision tree regression (DTR) and linear regression (LR) as base learners. The ensemble models significantly outperformed the individual base learners. The Stacking-2 model achieved the highest accuracy, with an R² of 0.954, an RMSE of 125.967 mg/kg, and an RPD of 4.667. To validate the model’s practical applicability, soil Cr content was spatially interpolated by Kriging from the Stacking-2 predictions. The spatial distribution maps of measured and predicted values agreed closely, indicating that the model maps Cr distribution across the study area accurately. These results support combining ensemble learning with Vis-NIR spectral preprocessing and SPA for soil heavy metal prediction, and they provide a scientific basis for soil quality monitoring, environmental risk assessment, agricultural land management, and pollution control.
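For readers unfamiliar with the three accuracy figures reported above, the following is a minimal sketch of how R², RMSE, and RPD are commonly computed from paired measured and predicted values. The function name `regression_metrics` and the sample values are hypothetical, and the use of the sample standard deviation (n - 1 denominator) for RPD is an assumption; the abstract does not state which convention the study used.

```python
import math


def regression_metrics(measured, predicted):
    """Return (r2, rmse, rpd) for paired measured and predicted values."""
    n = len(measured)
    if n != len(predicted) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 samples")

    mean_measured = sum(measured) / n
    # Sum of squared prediction errors and total sum of squares.
    sse = sum((y - p) ** 2 for y, p in zip(measured, predicted))
    sst = sum((y - mean_measured) ** 2 for y in measured)
    if sst == 0:
        raise ValueError("measured values have zero variance")

    r2 = 1.0 - sse / sst
    rmse = math.sqrt(sse / n)
    # Assumption: sample standard deviation (n - 1) of the measured values.
    sd = math.sqrt(sst / (n - 1))
    # RPD is the ratio of that standard deviation to the RMSE.
    rpd = sd / rmse if rmse > 0 else math.inf
    return r2, rmse, rpd


# Made-up Cr contents in mg/kg, for illustration only (not data from the study).
measured = [100.0, 200.0, 300.0, 400.0, 500.0]
predicted = [110.0, 190.0, 310.0, 390.0, 510.0]
r2, rmse, rpd = regression_metrics(measured, predicted)
print(f"R2={r2:.3f}  RMSE={rmse:.3f} mg/kg  RPD={rpd:.3f}")
```

If the standard deviation is instead computed with the same denominator as the RMSE, RPD reduces to 1/sqrt(1 - R²), so the two metrics carry the same information; under the n - 1 convention they differ slightly for small test sets.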