Landslide Susceptibility Prediction Using Machine Learning Methods: A Case Study of Landslides in the Yinghu Lake Basin in Shaanxi

Sheng Ma, Saier Wu, Jian Chen, Yurou Li

doi:10.3390/su152215836

Abstract

Landslide susceptibility prediction (LSP) is the basis for risk management and plays an important role in social sustainability. However, the accuracy of LSP modeling is constrained by various factors. This paper examines the effects of landslide data integrity, machine-learning (ML) models, and non-landslide sample-selection methods on the accuracy of LSP, taking the Yinghu Lake Basin in Ankang City, Shaanxi Province, as an example. First, a previous landslide inventory (totaling 46) and an updated landslide inventory (totaling 46 + 176) were established through data collection, remote-sensing interpretation, and field investigation. With the slope unit as the mapping unit, twelve conditioning factors were selected: elevation, slope, aspect, topographic relief, elevation variation coefficient, slope structure, lithology, normalized difference vegetation index (NDVI), normalized difference built-up index (NDBI), distance to road, distance to river, and rainfall. Next, the initial landslide susceptibility mapping (LSM) was obtained using the K-means algorithm, and non-landslide samples were determined using two methods: random selection and semi-supervised machine learning (SSML). Finally, the random forest (RF) and artificial neural network (ANN) machine-learning methods were used for modeling. The research results showed the following: (1) The performance of supervised machine learning (SML) (RF, ANN) is generally superior to that of unsupervised machine learning (USML) (K-means). Among the SML models, RF has the best prediction performance, followed by ANN. (2) The selection method of non-landslide samples has a significant impact on LSP, and the accuracy of the SSML-based non-landslide selection method is controlled by the ratio of the number of landslide samples to the number of mapping units. (3) The number of landslides affects the reliability of LSM results, because fewer landslides yield a smaller sample size for LSM, which deviates from reality. Although results obtained from such a dataset can appear satisfactory, the resulting zoning cannot reliably anticipate the newly added landslides identified through remote-sensing interpretation and field investigation. We propose that the landslide inventory be expanded through remote sensing to achieve accurate and unbiased LSM, since LSM based on adequate landslide samples is more reasonable. The research results of this paper will provide a reference basis for the uncertainty analysis of LSP and for regional landslide risk management.
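The workflow summarized above (K-means for an initial zoning, two ways of choosing non-landslide samples, then RF and ANN classifiers) can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic, and the names `select_random`, `select_from_low_cluster`, and `evaluate` are hypothetical. In particular, the abstract does not specify the SSML procedure, so the sketch assumes that non-landslide samples are drawn from the K-means clusters with the lowest landslide density. scikit-learn's `MLPClassifier` stands in for the ANN.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic stand-in for slope units: 2000 units x 12 conditioning factors.
n_units, n_factors = 2000, 12
X = rng.normal(size=(n_units, n_factors))
# Assumed ground truth (illustration only): factors 0 and 1 drive landslides.
p = 1.0 / (1.0 + np.exp(-(1.5 * X[:, 0] + 1.0 * X[:, 1] - 2.5)))
is_landslide = rng.random(n_units) < p
landslide_idx = np.flatnonzero(is_landslide)
unlabeled_idx = np.flatnonzero(~is_landslide)


def select_random(n):
    """Method 1: draw non-landslide units uniformly from unlabeled units."""
    return rng.choice(unlabeled_idx, size=n, replace=False)


def select_from_low_cluster(n, k=5):
    """Method 2 (assumed): draw from the lowest-density K-means clusters."""
    X_std = StandardScaler().fit_transform(X)
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X_std)
    # Landslide density per cluster serves as the initial susceptibility zoning.
    density = np.array([is_landslide[labels == c].mean() for c in range(k)])
    pool = np.array([], dtype=int)
    for c in np.argsort(density):  # least susceptible cluster first
        pool = np.concatenate([pool, unlabeled_idx[labels[unlabeled_idx] == c]])
        if pool.size >= n:
            break
    return rng.choice(pool, size=n, replace=False)


def evaluate(neg_idx):
    """Train RF and ANN on landslide vs. selected non-landslide units."""
    idx = np.concatenate([landslide_idx, neg_idx])
    y = np.concatenate([np.ones(len(landslide_idx)), np.zeros(len(neg_idx))])
    X_tr, X_te, y_tr, y_te = train_test_split(
        X[idx], y, test_size=0.3, stratify=y, random_state=0
    )
    models = {
        "RF": RandomForestClassifier(n_estimators=200, random_state=0),
        "ANN": make_pipeline(
            StandardScaler(),
            MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0),
        ),
    }
    return {
        name: roc_auc_score(y_te, m.fit(X_tr, y_tr).predict_proba(X_te)[:, 1])
        for name, m in models.items()
    }


n_neg = len(landslide_idx)  # balanced 1:1 sampling, an assumption
results = {
    "random": evaluate(select_random(n_neg)),
    "low_cluster": evaluate(select_from_low_cluster(n_neg)),
}
for method, scores in results.items():
    print(method, {name: round(auc, 3) for name, auc in scores.items()})
```

Because the test split here is drawn from the same selected negatives, a higher AUC for the cluster-based selection partly reflects easier negatives rather than better prediction. A fairer comparison would score each model against an independent inventory, in the spirit of the check against newly added landslides described above.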

Full Text