Detection of Harassment Type of Cyberbullying: A Dictionary of Approach Words and Its Impact

Syed Mahbub, Eric Pardede, A S M Kayes, Shehzad Ashraf Chaudhry

doi:10.1155/2021/5594175


Open Access

https://doi.org/10.1155/2021/5594175


Journal: Security and Communication Networks
Publication Date: Jun 4, 2021
Citations: 12
License type: CC BY 4.0

Affiliation: La Trobe University

Abstract

The purpose of this paper is to analyse the effects of predatory approach words in the detection of cyberbullying and to propose a mechanism for generating a dictionary of such approach words. The research incorporates analysis of chat logs from convicted felons to generate a dictionary of sexual approach words. By analysing data across multiple social networks, the study demonstrates the usefulness of such a dictionary of approach words in the detection of online predatory behaviour through machine learning algorithms. It also shows the difference in the nature of content across specific social network platforms. The proposed solution for detecting cyberbullying and the domain of approach words are scalable to fit real-life social media, which can have a positive impact on the overall health of online social networks. Different types of cyberbullying have different characteristics. However, existing cyberbullying detection works do not target any of these specific types. This research focuses on the sexual harassment type of cyberbullying and proposes a novel dictionary of approach words. Since cyberbullying is a growing threat to the mental health and intellectual development of adolescents in society, models targeted towards the detection of specific types of online bullying or predation should be encouraged among social network researchers.

Highlights

To address this limitation of the current detection techniques, we propose a model that performs textual analysis as a base for training a learning model and generates a dictionary of approach words to identify sexual harassment types of cyberbullying more accurately. The positive implications of such a detection
Our research investigates the contents of online social networks (OSNs) to answer the above research questions
For experiment A, the presence of approach words was not included in the feature space, whereas for experiment B, the binary feature vector was generated using a feature space that contained the presence of approach words. These experiments were designed to identify the effect of the approach words in the feature space on the overall model performance. The FormSpring dataset initially contained 6.88% of positive cyberbullying instances and an oversampling of factor 5 increased the percentage of positive cyberbullying instances to 34.4%
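
The difference between the two feature spaces in experiments A and B can be illustrated with a minimal Python sketch. It is not the authors' implementation: the vocabulary, the approach-word list, the tokenizer, and the function names are hypothetical placeholders, and the sketch assumes that "presence of approach words" is encoded as one extra binary feature, which the text above does not specify.

```python
# Hypothetical placeholders: neither list is taken from the paper.
VOCABULARY = ["hello", "school", "homework", "game"]
APPROACH_WORDS = {"meet", "alone", "photo"}


def tokenize(text: str) -> list[str]:
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    return [tok.strip(".,!?;:\"'()") for tok in text.lower().split()]


def binary_features(text: str, include_approach: bool) -> list[int]:
    """Build a binary presence vector over VOCABULARY.

    With include_approach=True (the experiment B setting in this sketch),
    one extra feature is appended: 1 if any approach word occurs, else 0.
    """
    tokens = set(tokenize(text))
    vector = [1 if word in tokens else 0 for word in VOCABULARY]
    if include_approach:
        vector.append(1 if tokens & APPROACH_WORDS else 0)
    return vector


post = "Can we meet alone after school?"
features_a = binary_features(post, include_approach=False)  # experiment A style
features_b = binary_features(post, include_approach=True)   # experiment B style
```

Training the same classifier on both kinds of vectors and comparing the scores would then isolate the contribution of the approach-word feature, which is the comparison the two experiments are described as making.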