Abstract

Ensuring social security has become a prime concern for security administrators. The widespread and recurrent use of social media sites poses a serious risk to the general public, as these sites are frequently used to organize various types of immoral events. To protect society from these dangers, an early detection system that can effectively detect events by analyzing social media data is essential. However, automating event detection has been difficult, as any such process must account for diverse writing styles, languages, dialects, post lengths, and so on. To overcome these difficulties, we developed a model for detecting events in Bengali and Banglish Facebook posts, classifying each event as protesting, celebrating, religious, or neutral. First, the text of the collected posts was processed for language detection, and the detected posts were then pre-processed using stopword removal and tokenization. Features were then extracted from the pre-processed text using three sub-processes: filtering, phrase matching of specific events, and sentiment analysis. The collected features were used to train our Bernoulli Naive Bayes classification model, which detected events with 90.41% accuracy for Bengali-language posts and 70% accuracy for Banglish posts. To evaluate the effectiveness of the proposed model more precisely, we compared it with two other classifiers: Support Vector Machine and Decision Tree.
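The classification step described above can be illustrated with a minimal sketch of a Bernoulli Naive Bayes classifier. This is not the authors' implementation. It assumes binary word-presence features instead of the paper's filtering, phrase-matching, and sentiment features, and the stopword list, the English toy posts, and the names `preprocess` and `BernoulliNB` are all hypothetical placeholders chosen for readability.

```python
import math
import re

# Hypothetical stopword list; the paper's Bengali/Banglish list is not given here.
STOPWORDS = {"the", "a", "is", "at", "we", "will", "for", "to", "with"}


def preprocess(post):
    """Tokenize on word characters, lowercase, and drop stopwords."""
    tokens = re.findall(r"\w+", post.lower())
    return {t for t in tokens if t not in STOPWORDS}


class BernoulliNB:
    """Bernoulli Naive Bayes over binary word-presence features."""

    def fit(self, posts, labels):
        docs = [preprocess(p) for p in posts]
        self.vocab = sorted(set().union(*docs))
        self.classes = sorted(set(labels))
        self.log_prior = {}
        self.p_word = {}
        for c in self.classes:
            class_docs = [d for d, y in zip(docs, labels) if y == c]
            self.log_prior[c] = math.log(len(class_docs) / len(labels))
            # P(word present | class) with add-one (Laplace) smoothing.
            self.p_word[c] = {
                w: (sum(w in d for d in class_docs) + 1) / (len(class_docs) + 2)
                for w in self.vocab
            }
        return self

    def predict(self, post):
        present = preprocess(post)
        best, best_score = None, -math.inf
        for c in self.classes:
            score = self.log_prior[c]
            for w in self.vocab:
                p = self.p_word[c][w]
                # Unlike multinomial NB, absent vocabulary words also contribute.
                score += math.log(p if w in present else 1.0 - p)
            if score > best_score:
                best, best_score = c, score
        return best


# Made-up toy posts for illustration only; not data from the paper.
train_posts = [
    "join the protest march at the square",
    "we protest the new rule",
    "happy birthday celebration party tonight",
    "victory celebration party for the team",
    "prayer gathering at the mosque",
    "eid prayer at the mosque",
    "weather is nice today",
    "eating lunch with friends",
]
train_labels = [
    "protesting", "protesting",
    "celebrating", "celebrating",
    "religious", "religious",
    "neutral", "neutral",
]

model = BernoulliNB().fit(train_posts, train_labels)
print(model.predict("protest march tomorrow"))
```

Words that never appeared in training (such as "tomorrow" here) are ignored at prediction time, since they are outside the learned vocabulary.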

Highlights

  • The scale and interactivity of social media sites result in the generation of a massive volume of data in the form of audio, video, text, and images relevant to users’ personal, social, political, and economic lives

  • To evaluate its effectiveness, our event-detection model was compared against the Support Vector Machine (SVM) and Decision Tree (DT) classifiers on the same dataset

  • The SVM and DT models were assessed on the same dataset to demonstrate the efficiency of our model, with Table 8 comparing the precision, recall, F1-score, and accuracy of all three models
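As a rough illustration of the four metrics named above, the following sketch computes precision, recall, F1-score, and accuracy for one class treated as the positive label. The function name `classification_metrics` and the label sequences are hypothetical. They are not taken from the paper or from its Table 8.

```python
def classification_metrics(y_true, y_pred, positive):
    """Precision, recall, and F1 for one class, plus overall accuracy."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    return {"precision": precision, "recall": recall,
            "f1": f1, "accuracy": accuracy}


# Made-up labels for illustration only.
y_true = ["protesting", "protesting", "celebrating",
          "religious", "neutral", "neutral"]
y_pred = ["protesting", "neutral", "celebrating",
          "religious", "neutral", "protesting"]

metrics = classification_metrics(y_true, y_pred, positive="protesting")
print(metrics)
```

For a multi-class comparison such as the one described, these per-class values would typically be computed for each of the four event classes and then averaged. Which averaging scheme the paper uses is not stated in this excerpt.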


Summary

Introduction

The scale and interactivity of social media sites result in the generation of a massive volume of data in the form of audio, video, text, and images relevant to users’ personal, social, political, and economic lives. Numerous researchers have examined traffic [5,6], disaster [7,8], disease [7,9], sporting [10], earthquake [8], and crime events [10], to name but a few. These studies have examined events described in English [11,12], Hindi [12], Mandarin [13], Urdu [14], Japanese [8], Korean [15], Arabic [16], and other languages, and have drawn their data from different platforms, including Twitter [17,18,19] and Sina Weibo [20,21]. Although these studies focused on different languages and platforms from different countries, none of them have analyzed

