Abstract

Numerous natural disasters driven by climate change pose major threats to public infrastructure and human lives. Rapid and accurate evaluation of disaster damage is essential for emergency rescue and recovery. In recent years, the Transformer has gained popularity across computer vision tasks, offering considerable potential for improving the accuracy of disaster damage assessment. Our research aims to determine whether the Vision Transformer (ViT) can assess natural disaster damage from high-resolution Unmanned Aerial Vehicle (UAV) imagery more effectively than conventional deep-learning semantic segmentation techniques. To bridge this gap, we examine whether Transformer-based models can outperform CNNs in accurately assessing the damage caused. A detailed performance comparison of state-of-the-art deep-learning semantic segmentation models (U-Net, SegNet, PSPNet, DeepLabv3+) and a Transformer-based framework (SegFormer) for damage assessment is presented. Experiments are conducted on two natural disaster damage datasets, RescueNet and FloodNet. The results identify SegFormer as the most appropriate model for estimating disaster damage, with mIoUs of 96% on the RescueNet dataset and 82.22% on the FloodNet dataset. Based on both quantitative evaluation and visual results, the Transformer outperforms conventional segmentation CNNs in understanding the scene as a whole and in assessing the severity of the damage.
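The reported scores are mean Intersection-over-Union (mIoU), the standard metric for comparing semantic segmentation models. The following is a minimal sketch of how mIoU can be computed from per-pixel class predictions; it is not the authors' evaluation code, and the class count and label maps are purely illustrative.

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """Mean Intersection-over-Union over all classes.

    pred, target: integer arrays of per-pixel class labels, same shape.
    Classes absent from both prediction and ground truth are skipped
    so they do not bias the mean.
    """
    ious = []
    for c in range(num_classes):
        pred_c = (pred == c)
        target_c = (target == c)
        union = np.logical_or(pred_c, target_c).sum()
        if union == 0:
            continue  # class not present in either map
        intersection = np.logical_and(pred_c, target_c).sum()
        ious.append(intersection / union)
    return float(np.mean(ious))

# Illustrative 4x4 segmentation maps with 3 classes
pred = np.array([[0, 0, 1, 1],
                 [0, 1, 1, 2],
                 [2, 2, 1, 2],
                 [0, 0, 2, 2]])
target = np.array([[0, 0, 1, 1],
                   [0, 1, 1, 1],
                   [2, 2, 2, 2],
                   [0, 0, 2, 2]])
print(f"mIoU: {mean_iou(pred, target, num_classes=3):.4f}")
```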
