In coseismic landslide hazard assessment (CLHA), advanced machine learning (ML) models have attracted significant attention because they handle the complex relationships between landslides and their influencing factors effectively. However, explaining how these models predict the spatial distribution of landslides from those factors remains challenging. This study compares the predictive performance of four models, XGBoost, random forest (RF), logistic regression (LR), and support vector machine (SVM), each optimized with the Tree-structured Parzen Estimator (TPE) algorithm, and applies the SHapley Additive exPlanations (SHAP) algorithm to the XGBoost model to obtain both global and local interpretations of CLHA. Across the tests, the optimized XGBoost model delivered the best predictive performance, with an accuracy of 0.864 and an AUC of 0.886. Global interpretations indicate that the occurrence of coseismic landslides is influenced primarily by triggering factors such as hypocentral distance and distance from the seismogenic fault. Among the conditioning factors, terrain roughness and elevation make the most significant contributions. Single-factor dependence plots show that the contribution of an individual factor to landslide occurrence varies across the range of its feature values. Two-factor dependence plots reveal that interactions between factors also play a crucial role in landslide occurrence. Combining field surveys with local interpretations confirms that the contributions of influencing factors vary significantly at the local scale. The main innovation of this study is the integration of the SHAP algorithm into the CLHA model, which reveals the model's decision-making mechanisms for the spatial prediction of coseismic landslides.
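To make the distinction between local and global interpretation concrete, the following is a minimal sketch of exact Shapley value attribution on a toy scoring function. It is not the study's XGBoost model, its data, or the tree-based SHAP implementation. The factor names, the coefficients in `toy_model`, the baseline, and the sample cells are all hypothetical and chosen only for illustration, with every factor assumed to be rescaled to [0, 1].

```python
from itertools import combinations
from math import factorial

# Hypothetical factor names, listed in the order used by toy_model.
FEATURES = ["hypocentral_distance", "fault_distance", "roughness", "elevation"]


def toy_model(x):
    """Stand-in susceptibility score (an assumption, not the paper's model).

    Shorter distances raise the score, and roughness and elevation
    enter through an interaction term.
    """
    hypo, fault, rough, elev = x
    return 0.5 * (1 - hypo) + 0.3 * (1 - fault) + 0.2 * rough * elev


def shapley_values(model, x, baseline):
    """Exact Shapley values for one sample (a local interpretation).

    Features outside a coalition are replaced by their baseline value.
    The cost grows as 2^n, so this is only practical for a few features.
    """
    n = len(x)

    def value(coalition):
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return model(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            # Shapley weight for coalitions of size k that exclude feature i.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def global_importance(model, samples, baseline):
    """Mean absolute Shapley value per feature (a global interpretation)."""
    totals = [0.0] * len(baseline)
    for sample in samples:
        for i, p in enumerate(shapley_values(model, sample, baseline)):
            totals[i] += abs(p)
    return [t / len(samples) for t in totals]


# Hypothetical reference point and grid cells.
baseline = [0.5, 0.5, 0.5, 0.5]
x = [0.1, 0.2, 0.8, 0.6]
samples = [x, [0.9, 0.7, 0.3, 0.4], [0.4, 0.1, 0.9, 0.9]]

phi = shapley_values(toy_model, x, baseline)
importance = global_importance(toy_model, samples, baseline)

for name, local, overall in zip(FEATURES, phi, importance):
    print(f"{name:22s} local={local:+.3f} global={overall:.3f}")
```

The local values for one cell sum to the difference between that cell's score and the baseline score, which is the property that lets each prediction be decomposed factor by factor. Averaging their magnitudes over many cells gives the kind of global ranking the abstract describes.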