Abstract
Excessive concentrations of Ni in soil harm human health, leading to disease, and also threaten animals and plants. Although the dangers of high Ni concentrations are widely recognized, rapid and large-scale tools for identifying Ni contamination are still lacking. Visible-near-infrared (Vis-NIR) spectroscopy has been employed to rapidly identify Ni contamination; however, previous studies suffer from issues inherent to small datasets and tend to neglect data imbalance. To address these issues, a large dataset comprising 18,675 soil samples was used to predict soil Ni contamination by combining Vis-NIR data with machine learning (ML). Data imbalance was addressed using two data sampling methods. To build a robust classification model for Ni contamination, four spectral preprocessing methods and four ML algorithms were compared. The optimal extreme gradient boosting model achieved recall, accuracy, area under the curve, and geometric mean scores of 0.8203, 0.8806, 0.9268, and 0.8508, respectively. Model predictions across the United States identified specific regions with a high probability of Ni contamination. Overall, the model developed in this study offers improved accuracy in predicting soil Ni contamination at the continental scale and can be used to prioritize further testing and guide policymaking.
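The abstract reports recall and geometric mean alongside accuracy, which matters when contaminated samples are a minority class. The sketch below illustrates why, assuming the common definition of the geometric mean as the square root of recall times specificity; the abstract does not state which definition the study uses. The function name and all confusion-matrix counts are hypothetical and are not taken from the study.

```python
import math


def imbalance_metrics(tp, fn, fp, tn):
    """Compute recall, specificity, accuracy, and G-mean from confusion counts.

    Assumes G-mean = sqrt(recall * specificity), a common definition for
    imbalanced binary classification (an assumption, not confirmed by the source).
    """
    recall = tp / (tp + fn)            # share of contaminated samples detected
    specificity = tn / (tn + fp)       # share of clean samples correctly cleared
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    g_mean = math.sqrt(recall * specificity)
    return {
        "recall": recall,
        "specificity": specificity,
        "accuracy": accuracy,
        "g_mean": g_mean,
    }


# Hypothetical imbalanced test set: 100 contaminated and 900 clean samples.
metrics = imbalance_metrics(tp=80, fn=20, fp=90, tn=810)

# A trivial classifier that labels every sample "clean" on the same test set.
always_clean = imbalance_metrics(tp=0, fn=100, fp=0, tn=900)
```

With these hypothetical counts, the trivial classifier reaches a slightly higher accuracy than the useful one, yet its recall and geometric mean are both zero. That is the failure mode that reporting recall and geometric mean is meant to expose.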