Landslide disasters in China are severe, and regional landslide early warning is an important means of disaster prevention and mitigation. The traditional regional landslide early-warning model, however, is constrained by the complex mechanisms that trigger landslides, limited data accumulation, and inadequate big data analysis methods, and it suffers from limited early-warning accuracy and insufficient refinement. In this paper, machine learning was introduced into the field of regional landslide disaster warning. A method for constructing a regional landslide early-warning model based on machine learning was systematically proposed, covering the full construction process: training sample-set construction, sample learning and training, model parameter optimization, model preservation, and warning output. In the sample learning and training, 80% of the training sample-set was used as the training set, and 20% was used as the test set for five-fold cross validation. The Bayesian Optimization algorithm was used to optimize the model parameters, and the accuracy, ROC curve, and AUC value were used to verify the model accuracy and generalization ability. Taking China’s Fujian Province as an example, and based on nine years of geological and meteorological data (2010–2018), a total of 26 indicators in four categories, covering geological environment factors, factors of hazard-affected bodies and historical disaster situations, and rainfall-induced factors, were used as input feature parameters. Six machine learning algorithms were used for model training; the results showed that the Random Forest algorithm performed the best, giving an accuracy of 92.3%, and was the model with the best generalization ability (AUC was 0.955). The second best was the Artificial Neural Network model, with an accuracy of 0.937 and an AUC of 0.935. 
Next were the Nearest Neighbor, Logistic Regression, and Support Vector Machine models; the Decision Tree model gave the poorest results. Finally, a typical rainfall-induced landslide disaster event in Fujian Province was selected as an example to verify the Random Forest model. Compared with the early-warning results of the original explicit statistical model, the hit rate of the new model was either equal to or 6 times that of the original model, and the landslide density in the early-warning area of the new model was 1.6–1.7 times that of the original model. Preliminary verification showed that the new model based on the Random Forest method has clear advantages, namely a higher hit rate and a smaller warning area, and can deliver more accurate warnings. Follow-up work will continue to track new landslide disasters in the study area and to verify and correct the model.
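The two scalar metrics named above, accuracy and AUC, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration. AUC is computed here through its rank interpretation, that is, the probability that a randomly chosen landslide sample receives a higher score than a randomly chosen non-landslide sample.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_score: Sequence[float],
             threshold: float = 0.5) -> float:
    """Fraction of samples whose thresholded score matches the label.

    The 0.5 threshold is an assumption, not a value from the paper.
    """
    hits = sum((s >= threshold) == bool(t) for t, s in zip(y_true, y_score))
    return hits / len(y_true)


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison.

    Counts how often a positive sample outscores a negative one;
    tied scores contribute 0.5.
    """
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = landslide occurred, 0 = no landslide.
labels = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.2, 0.7, 0.4, 0.6, 0.1, 0.8, 0.3]

print(f"accuracy = {accuracy(labels, scores):.3f}")
print(f"AUC      = {roc_auc(labels, scores):.4f}")
```

Accuracy depends on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is one reason the two metrics can order models differently.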