Online social networks have become important sources of information and contextual data in many areas of life, including finance, elections, social events, health, and sports. Detecting and classifying useful events reported in tweets has recently attracted considerable interest. However, because of the challenges inherent in the nature of these events, traditional approaches have not yielded satisfactory results. Deep learning-based word embedding representations, such as Word2Vec, GloVe, FastText, and BERT, have proven effective at improving detection performance by taking semantic context into account. This study proposes a model that stacks an LSTM on top of BERT representations to detect and classify events in tweets. To this end, a dataset of about 310,000 event-related tweets was collected and categorized into 50 event types based on a selected set of representative keywords. Multiple experiments were carried out on the collected dataset to evaluate the proposed model. The model attained an overall accuracy greater than 94.3% and an F1 score above 90%, achieving state-of-the-art results in the classification of most event categories.
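The keyword-based categorization step described above can be illustrated with a minimal sketch. The abstract does not list the 50 event types, the keywords, or the matching rule, so everything here is an assumption made for illustration: the three event types, their keyword lists, the `label_tweet` helper, and the "most keyword hits wins" rule are all hypothetical.

```python
import re
from typing import Dict, List, Optional

# Hypothetical keyword lists. The study's actual 50 event types and
# their representative keywords are not given in the abstract.
EVENT_KEYWORDS: Dict[str, List[str]] = {
    "election": ["election", "ballot", "vote"],
    "earthquake": ["earthquake", "aftershock"],
    "football": ["football", "world cup"],
}


def label_tweet(text: str,
                keywords: Dict[str, List[str]] = EVENT_KEYWORDS) -> Optional[str]:
    """Return the event type whose keywords match the tweet most often.

    Matching is case-insensitive and respects word boundaries, so "vote"
    does not match inside "devoted". Returns None when no keyword matches.
    Ties go to the event type listed first (an arbitrary choice).
    """
    lowered = text.lower()
    best_label: Optional[str] = None
    best_hits = 0
    for label, words in keywords.items():
        hits = sum(
            1 for word in words
            if re.search(r"\b" + re.escape(word) + r"\b", lowered)
        )
        if hits > best_hits:
            best_label, best_hits = label, hits
    return best_label
```

Labels produced this way are only as good as the keyword lists, which is one reason a learned classifier is then trained on top of them rather than using the keyword rule directly.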