Improving landslide susceptibility assessment is a long-standing problem in hazard mitigation, for which previous studies have proposed various training models. The ratio of positive to negative samples and the selection of non-landslide samples have also been shown to significantly influence results. These research directions have traditionally been the focal points, while datasets are often overlooked and serve merely as auxiliary tools to support the validation process. Hence, this study proposes an approach to enhance datasets through the introduction of the side-sampling method. This technique focuses on individual research cells and conducts feature sampling training on fixed regions of length M, thereby enabling more precise identification of geographical clustering characteristics. Using three distinct railway lines in China as the study areas, this study compares the side-sampling method with traditional sampling methods on accuracy, precision, recall, F1 score, and ROC curve. Apart from several exceptions, the results show substantial improvements: accuracy (+7.68%), precision (+7.19%), recall (+13.48%), F1 score (+9.92%), and ROC (+6.22%). These results demonstrate a significant overall improvement in the performance of models trained with the side-sampling method and offer a dataset-oriented perspective on mitigating landslide hazards along railways.
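For readers unfamiliar with the evaluation metrics named above, the following minimal sketch shows how accuracy, precision, recall, and F1 score are conventionally derived from binary predictions. It is not taken from the study: the function name `classification_metrics`, the label encoding (1 for a landslide cell, 0 for a non-landslide cell), and the toy labels are assumptions made for illustration only.

```python
def classification_metrics(y_true, y_pred):
    """Compute standard binary metrics from two equal-length label lists.

    Assumed encoding (hypothetical, not from the study):
    1 = landslide cell, 0 = non-landslide cell.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    total = tp + tn + fp + fn
    # Guard each ratio against a zero denominator.
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Toy labels, invented purely to exercise the function.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
metrics = classification_metrics(y_true, y_pred)
```

Recall weights missed landslides (false negatives), which may be why it is worth reporting separately from accuracy in a hazard setting; the ROC curve additionally requires predicted scores rather than hard labels and is not covered by this sketch.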