The uncertainty arising from the random sampling of non-debris flow samples significantly affects the accuracy of debris flow susceptibility assessment (DFSA). This study introduces a novel uncertainty elimination method based on Kernel Density Estimation (KDE) and compares it with the Mean and Maximum Probability Analysis (MPA) methods. We also investigate how four commonly used machine learning models respond to sampling uncertainty, comparing two structurally similar models, Random Forest (RF) and Extremely Randomized Trees (ERT), with two structurally different models, Support Vector Machine (SVM) and Multilayer Perceptron (MLP). The results indicate that applying these uncertainty elimination methods significantly improves AUC values and zoning accuracy, with the KDE method outperforming the others. Specifically, the KDE-based AUC values for RF, ERT, SVM, and MLP are 0.995, 0.999, 0.999, and 0.853, with corresponding zoning accuracies of 1.00, 1.00, 1.00, and 0.78, respectively. The study further reveals that the response to sampling uncertainty varies with model architecture: RF, ERT, and SVM typically exhibit bimodal normal distributions, whereas MLP shows a unimodal distribution. In addition, MLP is more sensitive to variations in the negative samples, whereas RF and ERT are less affected owing to their ensemble structure.
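The contrast between the Mean and KDE approaches can be illustrated with a minimal sketch. The abstract does not describe how the estimates are computed, so everything here is an assumption: that each mapping unit receives one predicted susceptibility per random draw of negative samples, that the Mean method averages those predictions, and that the KDE method takes the location of the density peak. The function name `kde_mode`, the Gaussian kernel, the normal-reference bandwidth rule, and the sample values are all hypothetical and chosen only for illustration.

```python
import math
import statistics


def kde_mode(samples, grid_size=201):
    """Return the point in [0, 1] where a Gaussian KDE of `samples` peaks.

    Illustrative only: the kernel and bandwidth rule are assumptions,
    not taken from the study.
    """
    n = len(samples)
    spread = statistics.stdev(samples) if n > 1 else 0.0
    # Normal-reference rule of thumb for the bandwidth; the floor avoids
    # a zero bandwidth when every run returns the same value.
    bandwidth = max(1.06 * spread * n ** (-1 / 5), 1e-3)
    # Evaluate the density on an evenly spaced grid of probabilities.
    grid = [i / (grid_size - 1) for i in range(grid_size)]

    def density(x):
        # Unnormalised kernel sum; the constant factor does not move the peak.
        return sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)

    return max(grid, key=density)


# Hypothetical predictions for one mapping unit over 30 repeated runs:
# most runs cluster near 0.9, a minority near 0.3 (a bimodal response).
samples = [0.88, 0.90, 0.92] * 7 + [0.28, 0.30, 0.32] * 3

mean_estimate = statistics.mean(samples)  # pulled toward the minority cluster
kde_estimate = kde_mode(samples)          # stays at the dominant cluster

print(f"mean: {mean_estimate:.3f}, KDE peak: {kde_estimate:.3f}")
```

Under these assumptions, the mean lands between the two clusters, at a value that few individual runs actually produced, while the density peak stays with the dominant cluster. This is one plausible reason a density-based estimate could behave differently from averaging when the predictions are bimodal.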