Abstract

Landslide susceptibility mapping has progressed significantly with improvements in machine learning techniques. However, inventory/data imbalance (DI) remains one of the challenges in this domain. The problem arises because a good-quality landslide inventory map, including a complete record of historical data, is difficult or expensive to collect, which can considerably limit the ability to obtain a sufficient inventory or representative samples. This research developed a new approach based on generative adversarial networks (GAN) to correct imbalanced landslide datasets. The proposed method was tested at Chukha Dzongkhag, Bhutan, one of the most landslide-prone areas in the Himalayan region. The proposed approach was then compared with standard methods such as the synthetic minority oversampling technique (SMOTE), dense imbalanced sampling, and sparse sampling (i.e., producing as many non-landslide samples as landslide samples). The comparisons were based on five machine learning models: artificial neural networks (ANN), random forests (RF), decision trees (DT), k-nearest neighbours (kNN), and support vector machines (SVM). Model evaluation was based on overall accuracy (OA), the Kappa index, F1-score, and the area under the receiver operating characteristic curve (AUROC). The spatial database comprised 269 landslides and 10 conditioning factors: altitude, slope, aspect, total curvature, slope length, lithology, distance from the road, distance from the stream, topographic wetness index (TWI), and sediment transport index (STI). The findings show that both the GAN and SMOTE data balancing approaches helped to improve the accuracy of the machine learning models. According to AUROC, with default parameters the GAN method boosted the models to maximum values of ANN (0.918), RF (0.933), DT (0.927), kNN (0.878), and SVM (0.907). With optimum parameters, ANN (0.927), RF (0.943), DT (0.923), and kNN (0.889) performed best with GAN, whereas SVM obtained its highest value (0.906) with SMOTE. Our findings suggest that RF balanced with GAN can provide the most reasonable criterion for landslide prediction. This research indicates that landslide data balancing may substantially affect the predictive capabilities of machine learning models. Therefore, the issue of DI in the spatial prediction of landslides should not be ignored. Future studies could explore other generative models for landslide data balancing. Because it builds on a state-of-the-art GAN, the proposed model can be considered for areas where data are limited or imbalanced.
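To make the baseline balancing step and three of the evaluation measures named above more concrete, the following is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the function names `smote_like_oversample` and `binary_metrics`, the two-feature toy points, and the toy label vectors are all hypothetical. The oversampler follows the general SMOTE idea (interpolating between a minority sample and one of its nearest minority neighbours); it is not the GAN-based method proposed in the paper. AUROC is left out because it needs continuous model scores rather than hard labels.

```python
import random
from math import dist


def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points, each lying on the segment between a
    minority sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by Euclidean distance to `base`.
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


def binary_metrics(y_true, y_pred):
    """Return (overall accuracy, Cohen's kappa, F1) for labels in {0, 1}."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    n = tp + tn + fp + fn
    oa = (tp + tn) / n
    # Agreement expected by chance, from the row and column totals.
    pe = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    kappa = (oa - pe) / (1 - pe) if pe != 1 else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
    return oa, kappa, f1


# Hypothetical toy data: 4 landslide points (minority) in a 2-feature space.
landslide = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
synthetic = smote_like_oversample(landslide, n_new=6)

# Hypothetical labels: 1 = landslide, 0 = non-landslide.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
oa, kappa, f1 = binary_metrics(y_true, y_pred)
```

Because every synthetic point is a convex combination of two existing minority points, the oversampler never leaves the region already spanned by the observed landslides. A generative model has no such built-in constraint.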

Highlights

  • Landslides are a form of natural hazard that pose significant threats to the environment and society [1]

  • This paper introduced a new method of data balancing based on generative adversarial networks (GAN) for improving training in various machine learning models, i.e., artificial neural networks (ANN), random forests (RF), decision trees (DT), k-nearest neighbours (kNN), and support vector machine (SVM)

  • Spatial prediction of landslides is an important step for landslide risk assessment and planning mitigation measures


Summary

Introduction

Landslides are a form of natural hazard that pose significant threats to the environment and society [1]. In many parts of the world, landslides frequently occur due to single, high-intensity (e.g., shallow, fast-moving landslides) or prolonged (days or weeks, e.g., slow-moving deep-seated landslides) rainfall events, earthquakes, or human activity [2]. Landslides can also be triggered by other mechanisms, such as volcanic eruptions, rapid snowmelt, and elevated water levels [3]. Landslides occur more frequently in mountainous areas than in areas of low relief. The steeper the slope, the more dominant the downslope component of gravity, which “pulls” material down the slope. A landslide begins when the driving force exceeds the resisting force, a limit that depends on the strength of the material, the frictional properties between the slide material and the rock, or both.

