Abstract

With the development of social media, it has become increasingly important to quickly and accurately identify social media texts related to disasters (e.g., typhoons) to aid rescue and recovery efforts. Currently, multi-class classification and the pre-trained language model Bidirectional Encoder Representations from Transformers (BERT) are widely used for text classification. However, most studies on typhoon damage classification are multi-class single-label, which contradicts the reality that a single social media text may describe multiple types of damage. Moreover, the outputs of BERT's hidden layers are not fully utilized. This paper proposes a two-stage multi-class multi-label classification method for typhoon damage assessment that fully integrates the outputs of the hidden layers of BERT. In the first stage, sentence vectors are used to identify typhoon damage-related texts. In the second stage, word matrices are used for multi-class multi-label classification, assigning the texts to five damage categories (i.e., transportation, public, electricity, forestry, and waterlogging). The two stages are trained end-to-end to identify typhoon damage from social media texts. Experiments on Sina Weibo texts posted during typhoon landfalls in Chinese coastal regions demonstrate that the proposed method can effectively improve the accuracy of text classification and comprehensively assess typhoon damage.
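The two-stage, multi-label decision described above can be illustrated with a minimal sketch. It is not the paper's implementation: the function `classify`, the use of an independent sigmoid per category, and the 0.5 threshold are all assumptions made for illustration, and the input logits are hypothetical stand-ins for scores that a BERT-based model would produce from sentence vectors (stage one) and word matrices (stage two).

```python
import math

# The five damage categories named in the abstract.
CATEGORIES = ["transportation", "public", "electricity", "forestry", "waterlogging"]


def sigmoid(x: float) -> float:
    """Map a raw score (logit) to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def classify(related_logit: float, category_logits: list[float],
             threshold: float = 0.5) -> list[str]:
    """Hypothetical two-stage decision rule (an assumption, not the paper's code).

    Stage 1: a single logit decides whether the text is damage-related.
    Stage 2: each category gets its own sigmoid, so a text can receive
    zero, one, or several labels (multi-label), unlike a softmax/argmax
    rule, which would force exactly one label (single-label).
    """
    if len(category_logits) != len(CATEGORIES):
        raise ValueError("expected one logit per damage category")

    # Stage 1: filter out texts that are not damage-related.
    if sigmoid(related_logit) < threshold:
        return []

    # Stage 2: independent yes/no decision for every category.
    return [
        name
        for name, logit in zip(CATEGORIES, category_logits)
        if sigmoid(logit) >= threshold
    ]


# Hypothetical logits for a post reporting a flooded road and a power outage.
labels = classify(2.0, [1.5, -2.0, 0.3, -1.0, 2.2])
print(labels)
```

Because each category is thresholded independently, the example post receives several labels at once, which is the behavior a single-label classifier cannot express.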

