Abstract

Web text, using natural language to describe a disaster event, contains a considerable amount of disaster information. Automatic extraction of this disaster information (e.g., time, location, casualties, and disaster losses) from web text is an important supplement to conventional disaster monitoring data. This study extracted and compared the characteristics of earthquake disaster information from web news media reports (news reports) and online disaster reduction agency reports (professional reports). Using earthquakes in China from 2015 to 2017 as a case study, a series of rules was created for extracting earthquake event information, including temporal extraction rules, a location trigger dictionary, and an attribute trigger dictionary. The differences in characteristics of news reports and professional reports were investigated in terms of their quantity and spatiotemporal distribution through statistical analysis, geocoding, and kernel density estimation. The information extracted from each set of reports was also compared with authoritative data. The results indicated that news reports are more extensive and have richer information. In contrast, professional reports are less repetitive as well as more accurate and standardized, mainly focusing on earthquakes with Ms ≥ 4 and/or earthquakes that may cause damage. These characteristics of disaster information from different web text sources can be used to improve the efficiency and analysis of disaster information extraction. In addition, the rule-based approach proposed herein was found to be an accurate and viable way to extract earthquake information from web text. The approach provides the technical basis and background information to support further research seeking human-centric disaster information, which cannot be acquired using traditional instrument monitoring methods, from web text.
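To make the idea of rule-based extraction concrete, the following is a minimal Python sketch of how temporal rules and trigger dictionaries might be combined. It is not the study's implementation: the trigger words, the regular expressions, the `extract_event` function, and the English sample sentence are all illustrative assumptions, and the study's actual rules and dictionaries are not reproduced here.

```python
import re

# Hypothetical trigger dictionaries (assumptions for illustration only).
LOCATION_TRIGGERS = ("occurred in", "struck")
ATTRIBUTE_TRIGGERS = {
    "deaths": ("killed", "dead"),
    "injuries": ("injured",),
}

# Illustrative temporal rule: a YYYY-MM-DD date, optionally followed by HH:MM.
TIME_RULE = re.compile(r"\b(\d{4}-\d{2}-\d{2})(?:\s+(\d{2}:\d{2}))?")
# Illustrative magnitude rule: "Ms" followed by a number.
MAGNITUDE_RULE = re.compile(r"\bMs\s*(\d+(?:\.\d+)?)")


def extract_event(text):
    """Return the fields found in one report sentence (None when absent)."""
    event = {"date": None, "time": None, "magnitude": None,
             "location": None, "deaths": None, "injuries": None}

    m = TIME_RULE.search(text)
    if m:
        event["date"], event["time"] = m.group(1), m.group(2)

    m = MAGNITUDE_RULE.search(text)
    if m:
        event["magnitude"] = float(m.group(1))

    # Location: the run of capitalized words right after a location trigger.
    for trigger in LOCATION_TRIGGERS:
        m = re.search(re.escape(trigger) + r"\s+([A-Z][a-z]+(?:\s+[A-Z][a-z]+)*)", text)
        if m:
            event["location"] = m.group(1)
            break

    # Attributes: a number shortly before an attribute trigger word.
    for field, triggers in ATTRIBUTE_TRIGGERS.items():
        for trigger in triggers:
            m = re.search(r"(\d+)\s+(?:people\s+)?(?:were\s+)?" + re.escape(trigger) + r"\b", text)
            if m:
                event[field] = int(m.group(1))
                break

    return event


# Made-up sample sentence, not taken from any real report.
sample = ("An Ms 5.5 earthquake occurred in Example County on "
          "2016-03-04 05:58; 8 people were killed and 23 injured.")
print(extract_event(sample))
```

A dictionary of triggers keeps the rules easy to audit and extend per source type, which matters when news reports and professional reports phrase the same facts differently. The extracted location string would still need geocoding before any spatial analysis.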

Highlights

  • In recent years, the Internet has become the foremost means for disseminating information and knowledge and has enabled the government, professional organizations, news media, and even the affected population to quickly publicize an overwhelming amount of disaster information

  • This study explored the differences in disaster information between web news media reports and online disaster reduction agency reports (‘professional reports’)

  • For general earthquakes (4.0 ≤ Ms < 5.0), 189 earthquake events were found in news reports and 104 in professional reports, compared with 308 earthquakes recorded by the China Earthquake Networks Centre (CENC)


Summary

Introduction

The Internet has become the foremost means for disseminating information and knowledge and has enabled the government, professional organizations, news media, and even the affected population to quickly publicize an overwhelming amount of disaster information. Being multi-source, dynamic, and heterogeneous, web text is a useful source of data to improve emergency response and strengthen disaster information acquisition. Numerous studies have explored the extraction and analysis of disaster information from web text. However, very few studies have focused on how different types of web text (e.g., news articles, official reports, and microblogs) influence the extracted disaster information. The structure of sentences, the information concerns, and the reporting perspectives vary among different web text sources [2]. How to make effective use of multiple web texts to assist disaster management, based on their diverse features, remains an unresolved question. The characteristics of disaster information from varied web texts therefore need to be further explored.

