Abstract
An important step in binary modeling of environmental problems is the generation of absence datasets. These are traditionally produced by random sampling, which can undermine the quality of model outputs. To address this problem, this study develops the Absence Point Generation (APG) toolbox, a Python-based ArcGIS toolbox for the automated construction of absence datasets for geospatial studies. The APG applies a frequency ratio analysis to four commonly used driving factors (altitude, slope degree, topographic wetness index, and distance from rivers) and combines it with buffer and density layers of the presence locations to delineate zones of low potential or susceptibility, within which the absence points are generated. To test the APG toolbox, we applied two benchmark algorithms, random forest (RF) and boosted regression trees (BRT), in a groundwater potential case study using three absence datasets: one generated by the APG toolbox, one by random sampling (Random), and one by the selection of absence samples (SAS) toolbox. BRT-APG and RF-APG achieved area under the receiver operating characteristic curve (AUC) values of 0.947 and 0.942, respectively, while BRT and RF performed worse with the SAS and Random datasets. Relative to the Random dataset, the APG dataset improved the AUC of BRT and RF by 7.2% and 9.7%, respectively; relative to the SAS dataset, the improvements were 6.1% and 5.4%. The APG also changed the importance of the input factors and the pattern of the groundwater potential maps, which underscores the importance of absence points in binary environmental modeling. The proposed APG toolbox could readily be applied to other environmental hazards such as landslides, floods, gully erosion, and land subsidence.
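The frequency ratio step mentioned above can be illustrated with a minimal sketch. This is not the APG toolbox's code: the function name, the toy slope classes, and the cutoff of FR < 1 for "low potential" are all assumptions made for illustration, and the buffer and density layers are left out. For each class of a driving factor, the frequency ratio is the share of presence points falling in that class divided by the share of the study area the class occupies.

```python
from collections import Counter


def frequency_ratio(cell_classes, presence_classes):
    """Return {class: FR} for one driving factor.

    cell_classes: class label of every raster cell in the study area.
    presence_classes: class label at every presence location.
    FR = (share of presence points in class) / (share of area in class).
    """
    cell_counts = Counter(cell_classes)
    presence_counts = Counter(presence_classes)
    total_cells = len(cell_classes)
    total_presence = len(presence_classes)

    fr = {}
    for cls, n_cells in cell_counts.items():
        presence_share = presence_counts.get(cls, 0) / total_presence
        area_share = n_cells / total_cells
        fr[cls] = presence_share / area_share
    return fr


# Hypothetical toy data: slope classes for 10 cells and 4 presence points.
cells = ["low"] * 5 + ["mid"] * 3 + ["high"] * 2
presence = ["low", "low", "low", "mid"]

fr = frequency_ratio(cells, presence)

# Assumed rule for this sketch: classes with FR < 1 are under-represented
# among presence points and are treated as low-potential candidates.
low_potential = [cls for cls, value in fr.items() if value < 1.0]
print(fr)
print(low_potential)
```

In this toy case the "low" slope class holds three of four presence points but only half of the area, so its ratio exceeds 1, while the other two classes fall below 1 and would be candidate zones for absence points under the assumed cutoff.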