Abstract

Recently, the use of citizen sensors (people who generate and share real data through social media) for detecting and disseminating emergency events in real time has increased considerably, because people at the place of the event, as well as elsewhere, can quickly post relevant information about this type of alert. Here, we present an emergency events dataset called UrbangEnCy. The dataset contains over 25500 texts in Spanish posted on Twitter from January 19th to August 19th, 2020, with emergency-related and non-emergency-related content in Ecuador. We obtained, cleaned, and filtered these tweets, and then selected the location and temporal data as well as the tweet content. In addition, the dataset includes annotations regarding the type of tweet (emergency / non-emergency) as well as additional nomenclature used to describe emergencies by the Center for immediate response service to emergencies (ECU 911) of Ecuador and by international emergency services agencies (ESAs). The UrbangEnCy dataset facilitates evaluating the performance of data science, machine learning, and natural language processing algorithms on supervised and unsupervised problems related to text mining and pattern recognition. The dataset is freely and publicly available at https://doi.org/10.17632/4x37zz82k8.

Highlights

  • The use of citizen sensors for detecting and disseminating emergency events in real time has increased considerably, because people at the place of the event, as well as elsewhere, can quickly post relevant information about this type of alert

  • The data set includes annotations regarding the type of tweet as well as additional nomenclature used to describe emergencies in the Center for immediate response service to emergencies (ECU 911) of Ecuador and international emergency services agencies (ESAs)

  • If a tweet is a real emergency event, it is classified by both ESAs and ECU 911 nomenclatures into category4, category2, and category3 variables, respectively

Summary

Data Description

The dataset provides tweets posted by citizen sensors on Twitter. These posts contain information about possible emergency events reported in Ecuador between January and August 2020. If a tweet is a real emergency event, it is classified under both the ESA and ECU 911 nomenclatures in the category variables. Each record also includes a unique identifier for the tweet, the city reported in the profile of the user who posted it, and the ECU 911 Center to which that place corresponds. Note that only 1491 tweets correspond to real emergencies, and for each emergency there are levels of detail according to the international (ESA) and Ecuadorian (ECU 911) nomenclatures. We noticed that most emergency events were reported by citizen sensors with Twitter accounts located in the ECU 911 Centers of Samborondón and Quito. The dataset construction process consisted of two stages: data acquisition and annotation.
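As a rough illustration of how the fields described above might be used, the following minimal sketch counts emergency tweets per ECU 911 Center. The column names (`id`, `label`, `city`, `ecu911_center`), the label values, and the sample rows are all assumptions made up for this example; they are not taken from the published dataset, whose actual file layout should be checked at the DOI given in the abstract.

```python
import csv
import io
from collections import Counter

# Toy rows invented for illustration only; they are NOT taken from the dataset.
# Column names (id, label, city, ecu911_center) are hypothetical.
SAMPLE_CSV = """id,label,city,ecu911_center
1,emergency,Quito,Quito
2,non-emergency,Samborondón,Samborondón
3,emergency,Samborondón,Samborondón
4,emergency,Quito,Quito
5,non-emergency,Cuenca,Cuenca
"""


def emergencies_by_center(csv_file):
    """Count rows labelled as emergencies, grouped by ECU 911 Center."""
    reader = csv.DictReader(csv_file)
    counts = Counter()
    for row in reader:
        # Keep only rows annotated as real emergencies (assumed label value).
        if row["label"] == "emergency":
            counts[row["ecu911_center"]] += 1
    return counts


counts = emergencies_by_center(io.StringIO(SAMPLE_CSV))
for center, n in counts.most_common():
    print(center, n)
```

With the real data, the in-memory sample would be replaced by an open file handle, and the column names adjusted to match the published files.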

Data acquisition
Annotation process
Findings
Ethics Statement
