Abstract

Fast and efficient landslide detection plays an important role in post-disaster rescue and risk assessment. Existing convolutional neural network (CNN) based landslide detection methods struggle to capture global long-range dependencies due to their limited receptive fields. Considering that landslide occurrence is influenced by both local and global conditions, we propose a novel multi-scale feature fusion scene parsing (MFFSP) framework that couples a CNN with a Transformer to learn local and global cues at different scales for landslide detection from satellite data. In the encoder, we design three modules, a visual geometry module (VGM), a residual learning module (RLM), and a Transformer module (TRM), to extract multi-scale features. Specifically, the VGM and RLM are built on convolution operations to extract local features by learning low-level and mid-level information, while the TRM is built on the self-attention mechanism to learn long-range dependencies. In the decoder, the TRM and VGM are further extended so that the model mines long-range dependencies and detailed spatial information by deeply fusing features from multiple scales. To evaluate the model, we conduct experiments on two study areas with four test regions and compare against seven state-of-the-art deep learning models. Extensive experiments demonstrate that MFFSP substantially outperforms the other algorithms. In addition, extensive ablation studies show that MFFSP fully combines the complementary advantages of the CNN and the Transformer to mine robust features.
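The core complementarity the abstract claims, convolutions capturing only local neighborhoods while self-attention relates all positions, can be illustrated with a minimal NumPy sketch. This is not the paper's implementation; the function names and the identity query/key/value projections are hypothetical simplifications chosen only to make the receptive-field contrast observable.

```python
import numpy as np

def local_conv1d(x, kernel):
    # Local operator: each output depends only on a small neighborhood,
    # analogous to the limited receptive field of the CNN-based modules
    # (the paper's VGM/RLM; this is an illustrative stand-in, not their design).
    k = len(kernel)
    pad = k // 2
    xp = np.pad(x, pad)
    return np.array([np.dot(xp[i:i + k], kernel) for i in range(len(x))])

def self_attention(x):
    # Global operator: every output is a softmax-weighted sum over ALL
    # positions, analogous to the long-range dependencies modeled by the
    # Transformer module (TRM). Identity Q/K/V projections for brevity.
    scores = x @ x.T / np.sqrt(x.shape[1])          # (seq, seq) similarities
    w = np.exp(scores - scores.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)               # row-wise softmax
    return w @ x                                    # mix all positions

# Perturbing a distant position changes the attention output everywhere,
# but leaves the convolution output at far-away positions untouched.
rng = np.random.default_rng(0)
feat = rng.normal(size=(16, 4))
feat_perturbed = feat.copy()
feat_perturbed[15] += 5.0
attn_changed = not np.allclose(self_attention(feat)[0],
                               self_attention(feat_perturbed)[0])

sig = rng.normal(size=32)
sig_perturbed = sig.copy()
sig_perturbed[31] += 5.0
kern = np.array([1.0, 2.0, 1.0])
conv_unchanged = np.allclose(local_conv1d(sig, kern)[0],
                             local_conv1d(sig_perturbed, kern)[0])
```

Here `attn_changed` is true and `conv_unchanged` is true: position 0 of the attention output reacts to an edit at position 15, whereas position 0 of the convolution output cannot see an edit at position 31. Fusing both operator types, as MFFSP does, lets a model use local texture and global context together.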
