Abstract
Selecting a suitable discretization method (DM) for spatially continuous variables (SCVs) is critical in machine learning (ML)-based natural hazard susceptibility assessment. However, few studies have considered the influence of the selected DM or how to efficiently select a suitable DM for each SCV. This study addresses both issues. The information loss rate (ILR), an index based on information entropy, appears usable for selecting the optimal DM for each SCV. However, the ILR fails to show the actual influence of discretization because it considers only how far the total amount of information in the discretized variable departs from that of the original SCV. To address this, we propose the information change rate (ICR), an index that focuses on the amount of information changed by discretization in each cell, enabling identification of the optimal DM. We develop a case study using Random Forest (training/testing ratio of 7:3) to assess flood susceptibility in Wanan County, China. Area under the curve (AUC)-based and susceptibility map-based approaches are used to compare the ILR and ICR. The results show that the ICR-based optimal DMs are more rational than the ILR-based ones in both cases. Moreover, the ILR values are unnaturally small (<1%), whereas the ICR values are more in line with general expectations (usually 10%–30%). These results demonstrate the superiority of the ICR. We believe this study fills existing research gaps and improves ML-based natural hazard susceptibility assessment.
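To make the limitation of an aggregate entropy index concrete, the following is a minimal Python sketch and not the authors' implementation. The abstract gives no formula for the ILR or ICR, so the loss rate used here, (H_ref - H_disc) / H_ref, is an assumed form, and the ICR is deliberately not implemented. The synthetic data, the 256-bin reference, and the function names `shannon_entropy`, `equal_interval`, and `quantile` are all hypothetical. The sketch only illustrates one certain fact: Shannon entropy depends on class frequencies alone, so reassigning classes among cells leaves it unchanged.

```python
import math
import random
from collections import Counter


def shannon_entropy(labels):
    """Shannon entropy (in bits) of the class frequencies in `labels`."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def equal_interval(values, k):
    """Assign each value to one of k equal-width classes."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    return [min(int((v - lo) / width), k - 1) for v in values]


def quantile(values, k):
    """Assign each value to one of k classes holding (nearly) equal counts."""
    order = sorted(range(len(values)), key=values.__getitem__)
    labels = [0] * len(values)
    for rank, idx in enumerate(order):
        labels[idx] = rank * k // len(values)
    return labels


random.seed(0)
# Synthetic, skewed stand-in for an SCV sampled on 1000 raster cells.
cells = [random.expovariate(1.0) for _ in range(1000)]

# Assumption: a fine 256-class binning stands in for the "original" SCV.
h_ref = shannon_entropy(equal_interval(cells, 256))

for name, dm in (("equal interval", equal_interval), ("quantile", quantile)):
    labels = dm(cells, 5)
    h_disc = shannon_entropy(labels)
    loss_rate = (h_ref - h_disc) / h_ref  # assumed ILR-like form, not from the paper
    print(f"{name}: entropy = {h_disc:.3f} bits, assumed loss rate = {loss_rate:.1%}")

# Reassign the quantile classes to cells at random: the class counts, and
# therefore the entropy, are unchanged even though many cells change class.
shuffled = labels[:]
random.shuffle(shuffled)
same_entropy = math.isclose(shannon_entropy(shuffled), shannon_entropy(labels))
changed_cells = sum(a != b for a, b in zip(labels, shuffled))
print(f"entropy unchanged: {same_entropy}, cells with a different class: {changed_cells}")
```

Because the shuffled map and the original map have identical entropy, any index built only on total entropy cannot tell them apart. This is the motivation the abstract gives for measuring the change cell by cell.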