Abstract

Landslide susceptibility maps indicate the spatial distribution of landslide likelihood. Modeling susceptibility over large or diverse terrains remains challenging because landslide data (the mapped extents of known landslides) are sparse and triggering conditions vary. Several strategies for sampling the landslide locations used to train a susceptibility model have been proposed to mitigate this challenge. However, to our knowledge, no study has systematically evaluated how different sampling strategies alter a model's predictor effects (i.e., how a predictor value influences the susceptibility output), which is critical to explaining differences in model outputs. Here, we introduce a statistical framework that examines the variation in predictor effects and in model accuracy (measured with receiver operating characteristic curves) to highlight why certain sampling strategies are more effective than others. Specifically, we apply the framework to an array of logistic regression models trained on landslide inventories collected at sub-regional scales over four terrains across the United States. Results show significant variation in predictor effects depending on the inventory used to train the models. These inconsistent predictor effects cause low accuracy when models are tested on inventories outside the domain of the training data. Grouping test and training sets according to physiographic and ecological characteristics, which are thought to share similar triggering mechanisms, does not improve model accuracy. We also show that, for applications over large regions, training a model with limited landslide data distributed uniformly over the entire modeling domain is better than using dense but spatially isolated data.
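The sketch below illustrates the kind of comparison the abstract describes, not the authors' actual code or data: one logistic regression susceptibility model is fit per landslide inventory, and the models are then compared on their predictor effects (fitted coefficients) and on cross-inventory ROC AUC. The predictor names, the synthetic inventories, and the underlying effect values are all hypothetical stand-ins.

```python
# Minimal sketch (hypothetical, synthetic data): per-inventory logistic
# regression, predictor-effect comparison, and cross-domain ROC AUC.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
predictors = ["slope", "relief", "mean_annual_precip"]  # assumed predictors

def make_inventory(n, coefs):
    """Synthetic presence/absence inventory generated from known effects."""
    X = rng.normal(size=(n, len(coefs)))
    p = 1.0 / (1.0 + np.exp(-(X @ coefs)))
    return X, rng.binomial(1, p)

# Two sub-regional inventories whose true predictor effects differ,
# mimicking variation in triggering conditions across terrains.
X_a, y_a = make_inventory(2000, np.array([1.5, 0.8, -0.2]))
X_b, y_b = make_inventory(2000, np.array([0.4, 1.2, 0.9]))

model_a = LogisticRegression().fit(X_a, y_a)
model_b = LogisticRegression().fit(X_b, y_b)

# Predictor effects: fitted coefficients depend on the training inventory.
for name, ca, cb in zip(predictors, model_a.coef_[0], model_b.coef_[0]):
    print(f"{name:20s} effect (A): {ca:+.2f}   effect (B): {cb:+.2f}")

# Transferability: test each model outside its training domain via ROC AUC.
print("A->B AUC:", roc_auc_score(y_b, model_a.predict_proba(X_b)[:, 1]))
print("B->A AUC:", roc_auc_score(y_a, model_b.predict_proba(X_a)[:, 1]))
```

In this toy setting, the divergent coefficients and the reduced cross-inventory AUCs play the role of the inconsistent predictor effects and the low out-of-domain accuracies reported in the abstract.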
