The accuracy of soil heavy metal pollution mapping depends heavily on the sampling strategies used in the preliminary and detailed survey stages of a site investigation. This study introduces an entropy-informed multi-stage sampling design (EIMSD) method that uses preliminary survey data as background information and applies relative entropy to select detailed-survey sampling points progressively, stage by stage. Results indicate that EIMSD outperforms the grid sampling design (GSD) and conventional sampling design (CSD) methods in both hypothetical and real-world study areas. Relative to GSD, R² values increase by 6.4% to 60.4% and RMSE values decrease by 16.7% to 54.0%; relative to CSD, R² values increase by 6.7% to 44.1% and RMSE values decrease by 16.5% to 39.7%. This study also investigates the optimal configuration of EIMSD, focusing on the number of detailed sampling points added per stage (Nadd) and the ratio of preliminary to detailed survey sample sizes (Np/Nd) when the sum of Np and Nd is held constant. Our findings show that adding one detailed sampling point per stage (Nadd = 1) is the most effective. For areas with strong spatial variability, a larger Np/Nd value of approximately 3/2 is recommended; for areas with moderate variability, a ratio close to 1 is suitable; and for areas with weak variability, a smaller Np/Nd value of about 2/3 is advised. EIMSD provides a more detailed and accurate map of soil heavy metal contamination, facilitating more targeted and effective remediation strategies.
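The two quantities the abstract relies on, relative entropy and the Np/Nd split of a fixed sample budget, can be illustrated with a minimal sketch. It uses the standard discrete Kullback-Leibler definition of relative entropy; the abstract does not say how EIMSD computes or applies it, so this is not the authors' implementation. The function names `relative_entropy` and `split_budget` and the total budget of 100 samples are hypothetical, chosen only for illustration.

```python
import math
from fractions import Fraction


def relative_entropy(p, q):
    """Standard discrete relative entropy D(p || q) in nats.

    Assumption: p and q are probability vectors over the same bins.
    How EIMSD builds these distributions is not stated in the abstract.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0:
            continue  # a term with p_i = 0 contributes nothing
        if qi == 0:
            return math.inf  # p puts mass where q has none
        total += pi * math.log(pi / qi)
    return total


def split_budget(total, ratio):
    """Split a fixed budget Np + Nd = total so that Np/Nd is close to ratio."""
    r = Fraction(ratio)
    n_p = round(total * r / (1 + r))
    return n_p, total - n_p


# Hypothetical budget of 100 samples under the three recommended ratios.
for label, ratio in [("strong", Fraction(3, 2)),
                     ("moderate", Fraction(1)),
                     ("weak", Fraction(2, 3))]:
    print(label, split_budget(100, ratio))
```

With a budget of 100, the three recommended ratios correspond to 60/40, 50/50, and 40/60 splits between preliminary and detailed samples.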