Abstract

Pollutants in the soil of industrial sites are often highly heterogeneously distributed, which makes it challenging to accurately predict their three-dimensional (3D) spatial distributions. Here we attempt to build effective 3D prediction models using machine learning (ML) and readily attainable multisource auxiliary data to improve the prediction accuracy of highly heterogeneous Zn in the soil of a small industrial site. Using raw covariates from functional area layout, stratigraphic succession, and electrical resistivity tomography, together with covariates derived from these raw covariates, as predictors, we created 6 individual and 2 ensemble models for Zn, based on ML algorithms such as k-nearest neighbors, random forest, and extreme gradient boosting, and on the stacking approach in ensemble ML. Results showed that the overall 3D spatial patterns of Zn predicted by individual and ensemble ML models, inverse distance weighting (IDW), and ordinary Kriging (OK) were similar, but their predictive performances differed significantly. The ensemble model with raw and derived covariates had the highest accuracy in representing the complex 3D spatial patterns of Zn (R² = 0.45, RMSE = 344.80 mg kg−1), compared with the individual ML models (R² = 0.27–0.44, RMSE = 396.75–348.56 mg kg−1), OK (R² = 0.33, RMSE = 381.12 mg kg−1), and IDW interpolation (R² = 0.25, RMSE = 402.94 mg kg−1). Moreover, incorporating derived covariates yielded larger gains in prediction accuracy than replacing a single ML algorithm with ensemble ML. These results highlight the importance of developing derived covariates while adopting ML when predicting the 3D distribution of highly heterogeneous pollutants in the soil of small industrial sites.
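To make the IDW baseline and the two reported accuracy measures concrete, the following is a minimal sketch and not the authors' implementation. It assumes a distance power of 2, uses all samples as neighbors, and computes R² as the coefficient of determination; the abstract does not state any of these choices. The function names and the sample values are hypothetical.

```python
import math


def idw_predict(samples, query, power=2.0):
    """Inverse distance weighting in 3D.

    samples: list of ((x, y, z), value) pairs (hypothetical layout).
    query:   (x, y, z) location to predict.
    power:   distance exponent; 2.0 is an assumed, common default.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for coords, value in samples:
        d = math.dist(coords, query)
        if d == 0.0:
            # Query coincides with a sample: return the observed value.
            return value
        w = 1.0 / d ** power
        weighted_sum += w * value
        weight_total += w
    return weighted_sum / weight_total


def rmse(observed, predicted):
    """Root mean squared error, in the units of the observations."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r2(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def leave_one_out_idw(samples, power=2.0):
    """Predict each sample from all the others (a simple validation scheme)."""
    predictions = []
    for i, (coords, _) in enumerate(samples):
        others = samples[:i] + samples[i + 1:]
        predictions.append(idw_predict(others, coords, power))
    return predictions


if __name__ == "__main__":
    # Made-up Zn concentrations (mg/kg) at (x, y, depth) locations in metres.
    demo = [
        ((0.0, 0.0, 0.5), 120.0),
        ((10.0, 0.0, 0.5), 950.0),
        ((0.0, 10.0, 1.5), 80.0),
        ((10.0, 10.0, 1.5), 400.0),
        ((5.0, 5.0, 1.0), 310.0),
    ]
    obs = [value for _, value in demo]
    pred = leave_one_out_idw(demo)
    print("RMSE:", rmse(obs, pred))
    print("R2:", r2(obs, pred))
```

The same `rmse` and `r2` helpers could be applied to predictions from any of the ML models, so that all methods are compared on identical held-out samples.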
