Abstract

Twitter enables millions of active users to send and read concise messages on the internet every day. Yet some people use Twitter to propagate violent and threatening messages, resulting in cyberbullying. Previous research has focused on whether cyberbullying behavior exists in a tweet (binary classification). In this research, we developed a model for detecting the severity of cyberbullying in a tweet. The model is feature-based: it uses features from the content of a tweet to train a machine learning classifier that labels tweets as non-cyberbullied or as low-, medium-, or high-level cyberbullied. We introduced pointwise semantic orientation as a new input feature, alongside predicted features (gender, age, and personality type) and Twitter API features. Results from experiments with our proposed framework in a multi-class setting are promising with respect to Kappa (84%), classifier accuracy (93%), and F-measure (92%). Overall, 40% of the classifiers increased performance in comparison with baseline approaches. Our analysis shows that the features with the highest odds ratios are: for low-level severity, the 19–22 age group and users whose Twitter account has been active for less than one year; for medium-level severity, neuroticism, the 23–29 age group, and having been a Twitter user for one to two years; and for high-level severity, neuroticism, extraversion, and the number of times a tweet has been favorited by other users. We believe that this multi-class classification approach provides a step forward in identifying severity at different levels (low, medium, high) when the content of a tweet is classified as cyberbullied. Lastly, the current study focused only on the Twitter platform; other social network platforms can be investigated using the same approach to detect cyberbullying severity patterns.
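
The three metrics reported above (Kappa, accuracy, and F-measure) can be computed for a four-class severity task as in the minimal sketch below. It is an illustration under stated assumptions, not the authors' evaluation code: the label strings, the toy predictions, the `evaluate` helper, and the choice of macro-averaging for the F-measure are all hypothetical, since the abstract does not say how the F-measure was averaged.

```python
from collections import Counter

# Hypothetical label names for the four severity classes described in the abstract.
LABELS = ["non", "low", "medium", "high"]


def evaluate(y_true, y_pred, labels=LABELS):
    """Return accuracy, Cohen's kappa, and macro-averaged F-measure."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    pairs = Counter(zip(y_true, y_pred))  # confusion-matrix cells
    true_counts = Counter(y_true)         # row totals
    pred_counts = Counter(y_pred)         # column totals

    # Observed agreement, which equals classifier accuracy.
    p_o = sum(pairs[(c, c)] for c in labels) / n
    # Agreement expected by chance from the marginal totals.
    p_e = sum(true_counts[c] * pred_counts[c] for c in labels) / (n * n)
    kappa = (p_o - p_e) / (1 - p_e) if p_e < 1 else 1.0

    # Per-class F-measure, then an unweighted (macro) average: an assumption.
    f_scores = []
    for c in labels:
        tp = pairs[(c, c)]
        precision = tp / pred_counts[c] if pred_counts[c] else 0.0
        recall = tp / true_counts[c] if true_counts[c] else 0.0
        denom = precision + recall
        f_scores.append(2 * precision * recall / denom if denom else 0.0)
    macro_f = sum(f_scores) / len(f_scores)

    return {"accuracy": p_o, "kappa": kappa, "macro_f": macro_f}


# Toy data for illustration only; these are not results from the study.
y_true = ["non", "non", "low", "low", "medium", "medium", "high", "high"]
y_pred = ["non", "non", "low", "medium", "medium", "medium", "high", "low"]
scores = evaluate(y_true, y_pred)
print(scores)
```

Kappa corrects accuracy for the agreement expected by chance, which matters when the severity classes are imbalanced; this is one reason to report it alongside accuracy in a multi-class setting.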

Highlights

  • Cyberbullying is a conscious and persistent act of violence that aims to threaten or harm individuals, deliberately and repeatedly using communication and information technologies

  • Twitter allows a user to post comments known as tweets; photos are posted on Instagram; and Facebook offers users the ability to post text, images, and videos [14]

  • The tree classifiers, Decision Tree (DT) and Random Forest (RF), performed significantly better than the function classifier Support Vector Machine (SVM), the probabilistic classifier Naïve Bayes (NB), and the lazy-learning, instance-based classifier K-Nearest Neighbors (KNN)

Summary

Introduction

Cyberbullying is a conscious and persistent act of violence that aims to threaten or harm individuals, deliberately and repeatedly, using communication and information technologies. The emergence and increased use of the internet, especially Twitter and Facebook, have exacerbated this situation [1]. Twitter allows a user to post comments known as tweets; photos are posted on Instagram; and Facebook offers users the ability to post text, images, and videos [14]. Twitter enables millions of active users to send and read concise, informative messages every day, yet some people use it to propagate violent and threatening messages. Cyberbullying is steadily increasing, partly due to the accelerated growth of online communication platforms, especially among young people. Developing efficient and effective methods for detecting online phenomena on Twitter is difficult due to: (i) the informal language and short text of a tweet; (ii) the fairly limited context provided in each tweet; and (iii) the prevalence of bots or spam accounts.

