Real-time identification of urban rainstorm waterlogging disasters based on Weibo big data

Yang Xiao,Zaiwu Gong,Beiqun Li

doi:10.1007/s11069-018-3427-4

Abstract

With the acceleration of urbanisation in China, preventing and reducing the economic losses and casualties caused by urban rainstorm waterlogging disasters have become a critical and difficult issue that the government is concerned about. As urban storms are sudden, clustered, continuous, and cause huge economic losses, it is difficult to conduct emergency management. Developing a more scientific method for real-time disaster identification will help prevent losses over time. Examining social media big data is a feasible method for obtaining on-site disaster data and carrying out disaster risk assessments. This paper presents a real-time identification method for urban-storm disasters using Weibo data. Taking the June 2016 heavy rainstorm in Nanjing as an example, the obtained Weibo data are divided into eight parts for the training data set and two parts for the testing data set. It then performs text pre-processing using the Jieba segmentation module for word segmentation. Then, the term frequency–inverse document frequency method is used to calculate the feature items weights and extract the features. Hashing algorithms are introduced for processing high-dimensional sparse vector matrices. Finally, the naive Bayes, support vector machine, and random forest text classification algorithms are used to train the model, and a test set sample is introduced for testing the model to select the optimal classification algorithm. The experiments showed that the naive Bayes algorithm had the highest macro-average accuracy.

Full Text