Predicting the natural distribution of heavy metals (HMs) in soil is important for understanding the potential risk of pollution. However, suitable technologies for large-scale prediction are still lacking because of the large spatial heterogeneity. In this study, we developed machine learning models to predict the natural contents of five typical HMs in soil: As, Cd, Cr, Hg and Pb. The optimal random forest (RF) model performed best, with an R² of up to 0.64. Based on this model, the predicted distribution of the five HMs showed that elevated contents were mainly concentrated in southwest and south-central China. Feature analysis showed that the importance of natural factors followed the order geological attributes > soil properties > climatic conditions > ecological functions. In particular, the lithology of the parent material dominated metal contents, with contributions of 18-25%. Moreover, soil pH, cation exchange capacity, soil profile depth and vegetation coverage influenced the individual HMs differently, owing to differences in the properties of the metals. This study established a mapping relationship between natural factors and soil HMs using data science methods, which may provide useful information for pollution control and planning decisions.
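The workflow summarized above (fit a random forest regressor, score it with R², then rank natural factors by importance) can be illustrated with a minimal sketch. This is not the study's implementation: it assumes scikit-learn's `RandomForestRegressor` with impurity-based importances, the feature names and their factor groups are hypothetical, and the data are synthetic, so the printed numbers carry no scientific meaning.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Hypothetical predictors mapped to the four factor groups named in the abstract.
# These names are illustrative assumptions, not the study's actual variables.
FEATURE_GROUPS = {
    "lithology_code": "geological attributes",
    "soil_ph": "soil properties",
    "cation_exchange_capacity": "soil properties",
    "annual_precipitation": "climatic conditions",
    "vegetation_coverage": "ecological functions",
}


def make_synthetic_data(n_samples=500, seed=0):
    """Generate synthetic predictors and an arbitrary response (no real soil data)."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n_samples, len(FEATURE_GROUPS)))
    # Arbitrary relationship chosen only so the model has something to learn.
    y = 2.0 * X[:, 0] + 1.0 * X[:, 1] + 0.5 * X[:, 3]
    y = y + rng.normal(scale=0.5, size=n_samples)
    return X, y


def fit_and_rank(seed=0):
    """Fit a random forest, return held-out R2 and factor groups ranked by importance."""
    X, y = make_synthetic_data(seed=seed)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=seed
    )
    model = RandomForestRegressor(n_estimators=200, random_state=seed)
    model.fit(X_train, y_train)
    r2 = r2_score(y_test, model.predict(X_test))

    # Sum per-feature importances within each factor group.
    group_importance = {}
    for name, importance in zip(FEATURE_GROUPS, model.feature_importances_):
        group = FEATURE_GROUPS[name]
        group_importance[group] = group_importance.get(group, 0.0) + float(importance)

    ranked = sorted(group_importance.items(), key=lambda kv: kv[1], reverse=True)
    return r2, ranked


if __name__ == "__main__":
    r2, ranked = fit_and_rank()
    print(f"held-out R2: {r2:.2f}")
    for group, importance in ranked:
        print(f"{group}: {importance:.2f}")
```

Summing importances within a group is one simple way to compare categories of factors; permutation importance or SHAP values would be reasonable alternatives, and the abstract does not say which method the authors used.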