Abstract
The objective of this study is to propose and test a hybrid machine learning pipeline that uncovers how disaster events unfold at different locations, using social media posts shared during disasters. Effective disaster response and recovery require a comprehensive understanding of disaster situations, i.e., the unfolding of disaster events and the geographic distribution of disruptions. Existing studies have employed machine learning methods to conduct coarse-grained event detection and to analyze geographical location information from geotagged social media data. However, only a very small fraction of social media data is geotagged, and the geotag may not correspond to the events described in the content of the posts. In addition, the coarse-grained information detected by existing approaches is token-based, which does not provide sufficient information for situation awareness. Hence, detecting locations and finer-grained event information could significantly improve the utility, credibility, and interpretability of social media data for situation awareness. To address these limitations, this study proposes a hybrid machine learning pipeline that makes use of all relevant tweets to uncover the evolution of disaster events across different locations. The pipeline integrates Named Entity Recognition for detecting locations mentioned in the posts, a location fusion approach to extract the coordinates of the locations and remove noise, a fine-tuned BERT model for classifying posts into humanitarian categories, and graph-based clustering to identify credible situational information. The application of the pipeline is demonstrated using a data set collected from Twitter during the 2017 Hurricane Harvey in Houston. The results show the capability of the proposed hybrid pipeline for automated mapping of events across time and space from social media posts with considerable accuracy.
The findings also suggest the potential for forensic analysis of disasters based on the mapped events, their evolution, and the variation of social media attention across locations. Hence, this method could provide a useful tool to support emergency managers, public officials, residents, first responders, and other stakeholders in rapid situation awareness across time and space.
Highlights
Natural disasters such as hurricanes, wildfires, and earthquakes cause large-scale disruptions over affected areas [1]
This study presented a hybrid machine learning pipeline to automatically map the evolution of disaster events across different locations using social media posts
The proposed hybrid pipeline integrates named entity recognition, location positioning and fusion, a fine-tuned BERT-based classifier, and graph-based clustering. This pipeline has two important capabilities that previous studies have not achieved: (1) detecting credible situational information for a location in evolving disaster conditions on social media; and (2) mapping the geographic distribution of social media attention in the disaster-affected area
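To make the graph-based clustering step more concrete, the following is a minimal sketch of one way such a step could work. It is not the authors' implementation: the text above does not specify how the graph is built, so the token-level Jaccard similarity, the 0.5 threshold, the minimum cluster size, and the function names `jaccard`, `cluster_posts`, and `credible_clusters` are all assumptions made for illustration. Posts are treated as nodes, sufficiently similar posts are joined by an edge, and connected components that contain several posts are kept as candidate credible reports.

```python
from itertools import combinations


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity between two token sets (0.0 if both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def cluster_posts(posts, threshold=0.5):
    """Group posts into connected components of a similarity graph.

    Nodes are posts. An edge joins two posts whose token-level Jaccard
    similarity is at least `threshold` (an illustrative value, not one
    taken from the study). Returns lists of post indices.
    """
    tokens = [set(p.lower().split()) for p in posts]
    parent = list(range(len(posts)))  # union-find forest over post indices

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(posts)), 2):
        if jaccard(tokens[i], tokens[j]) >= threshold:
            parent[find(i)] = find(j)  # merge the two components

    clusters = {}
    for i in range(len(posts)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values(), key=lambda c: c[0])


def credible_clusters(posts, threshold=0.5, min_size=2):
    """Keep clusters in which at least `min_size` posts corroborate each other."""
    return [c for c in cluster_posts(posts, threshold) if len(c) >= min_size]
```

Under these assumptions, a report repeated by several similar posts survives the filter, while an isolated post is dropped. In a fuller pipeline the token sets would likely be replaced by learned embeddings, and clustering would be run separately for each detected location and humanitarian category.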
Summary
Natural disasters such as hurricanes, wildfires, and earthquakes cause large-scale disruptions over affected areas [1]. Mouzannar et al. proposed a multimodal deep learning framework to identify damage-related information [12]. The development of these techniques enables the detection of disaster events using posts shared on social media. Another stream of studies has focused on estimating the geographical information of disaster-related tweets in order to assess disaster damage across different locations. Better techniques are needed to improve the accuracy and resolution of location information obtained from social media data. To address this gap, this study proposed a hybrid machine learning pipeline to extract credible and interpretable situational information for understanding the evolution of disaster events and locations from social media posts. A case study of the 2017 Hurricane Harvey in Houston was conducted to examine the capabilities of the proposed pipeline in automated mapping of events and locations.