Abstract

Social media data are numerous, constantly updated, and have distinctive characteristics. To quickly extract the information needed to address earthquake emergencies from these data, this paper studies a topic-words detection model for earthquake emergency microblog messages. First, a case analysis method is used to analyze microblog information posted after earthquake events, and an earthquake emergency information classification hierarchy is constructed based on public demand. Then, topic-word sets of different granularities are generated from the classification hierarchy, and a detection model for new topic-words is studied to improve and complete these sets. The validity, timeliness, and completeness of the topic-words detection model are verified using 2201 messages obtained after the 2014 Ludian earthquake. The results show that the information acquisition time of the model is short. The validity of the whole set is 96.96%, and the average and maximum validity of single words are 78% and 100%, respectively. In the Ludian and Jiuzhaigou earthquake cases, new topic-words added to different earthquakes only reach single digits in validity. The experiments therefore show that the proposed model can quickly obtain effective and pertinent information after an earthquake, and that the completeness of the earthquake emergency information classification hierarchy can meet the needs of other earthquake emergencies.

Highlights

  • Since 1980, China has been among the top five countries most frequently affected by damaging earthquakes [1]

  • This paper focuses on topic detection after earthquakes and uses a cross-validation method to construct an information classification system for earthquake emergencies based on Sina microblog data

  • The classification hierarchy has summarized the main categories of the earthquake emergency information based on the microblog messages. The classification is corrected and the messages are repeatedly reclassified until the pe… becomes less than 10%, yielding the final version of the emergency information classification

Summary

Introduction

Since 1980, China has been among the top five countries most frequently affected by damaging earthquakes [1]. Many studies on social media data collection, extraction, and analysis have been conducted to meet the requirements of natural disaster management, including for earthquakes, floods, and typhoons [8,9,10,15,16,17]. Automatic information acquisition from the Internet is the first step in organizing and managing earthquake emergency information. An exhaustive literature review shows that classification and information extraction from the perspective of public demand during the earthquake emergency response process have not yet been reported. This paper focuses on topic detection after earthquakes and uses a cross-validation method to construct an information classification system for earthquake emergencies based on Sina microblog data.
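To make the idea of filtering messages with topic-word sets more concrete, the following is a minimal sketch in Python. It is not the paper's model: the categories, the topic-words, the substring matching rule, and the definition of validity used here (the share of retrieved messages that are actually relevant) are all assumptions chosen for illustration.

```python
# Hypothetical topic-word sets. The categories and words are illustrative
# assumptions, not the classification hierarchy built in the paper.
TOPIC_WORDS = {
    "casualties": {"injured", "death toll", "missing"},
    "rescue": {"rescue team", "firefighters", "medical team"},
    "supplies": {"tents", "drinking water", "blankets"},
}


def classify(message, topic_words=TOPIC_WORDS):
    """Return every category that has at least one topic-word in the message."""
    text = message.lower()
    return {
        category
        for category, words in topic_words.items()
        if any(word in text for word in words)
    }


def validity(messages, relevant, topic_words=TOPIC_WORDS):
    """Share of retrieved messages that are relevant (assumed definition).

    `relevant` holds one manually assigned boolean per message.
    """
    retrieved = [
        is_relevant
        for message, is_relevant in zip(messages, relevant)
        if classify(message, topic_words)
    ]
    return sum(retrieved) / len(retrieved) if retrieved else 0.0


messages = [
    "Rescue team arrived with tents",
    "Nice weather today",
    "Tents on sale this weekend",
]
relevant = [True, False, False]
score = validity(messages, relevant)
```

In this toy run the third message is retrieved by the word "tents" even though it is unrelated to the earthquake, which is the kind of case that lowers the validity of a single word relative to that of the whole set.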

Data Sources
Data Preprocessing
Process of Establishing the Classification Hierarchy
Earthquake Emergency Information Classification Hierarchy
Topic-Words Detection Model Construction
Checking between Categories
Constructing Coarse-Grained and Fine-Grained Feature Word Sets
Fuzzy Feature Word Set Processing
Model Validation
Coarse-Grained and Fine-Grained Word Sets
Analysis of the Information Classification Validity
Analysis of the Information Collection Timeliness
Analysis of the Experiment
Analysis of the Topic-Word Set Completeness
Hot Topic-Word Application
Findings
Conclusions
