Detecting Aggressiveness in Tweets: A Hybrid Model for Detecting Cyberbullying in the Spanish Language

Manuel Lepe-Faúndez, Christian Vidal-Castro, Alejandra Segura-Navarrete, Claudia Martínez-Araneda, Clemente Rubio-Manzano

doi:10.3390/app112210706

Open Access

https://doi.org/10.3390/app112210706

Abstract

In recent years, the use of social networks has increased exponentially, which has led to a significant increase in cyberbullying. In Computer Science, research has addressed how to detect aggressiveness in texts, which is a prelude to detecting cyberbullying. Most of this work targets English-language texts and relies mainly on Machine Learning (ML) approaches, on Lexicon approaches to a lesser extent, and only rarely on hybrid approaches. Hybrid approaches combine Lexicons with ML algorithms, for example by counting the bad words in a sentence with a Lexicon of bad words and using that count as an input feature for classification algorithms. This research aims to contribute to detecting aggressiveness in Spanish-language texts by creating models that combine the Lexicon and ML approaches. Twenty-two models that combine techniques and algorithms from both approaches are proposed; for each, certain hyperparameters are tuned on the training datasets of the corpora to obtain the best results on the test datasets. Three Spanish-language corpora are used in the evaluation: Chilean, Mexican, and Chilean-Mexican. The results indicate that hybrid models obtain the best results on all three corpora, outperforming the implemented models that do not use Lexicons. This shows that mixing approaches improves aggressiveness detection. Finally, a web application is developed that puts each model to use by classifying tweets; it allows the models to be evaluated on external corpora and collects feedback on each model's predictions for future research. In addition, an API is available that can be integrated into parental-control tools, online plugins for analyzing writing on social networks, and educational tools, among others.
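The hybrid idea mentioned in the abstract, counting lexicon hits and passing the count to a classifier as a feature, can be illustrated with a minimal sketch. The three-word lexicon, the function names, and the extra ratio feature are assumptions made for illustration; the paper's actual Spanish lexicons and feature definitions are not reproduced on this page.

```python
import re
import unicodedata

# Hypothetical mini-lexicon, for illustration only (not the paper's lexicon).
BAD_WORDS = {"idiota", "tonto", "imbécil"}

def tokenize(text):
    """Normalize to NFC, lowercase, and split into Unicode word tokens."""
    normalized = unicodedata.normalize("NFC", text).lower()
    return re.findall(r"\w+", normalized)

def lexicon_features(text, lexicon=BAD_WORDS):
    """Return [bad-word count, bad-word ratio] for a single text."""
    tokens = tokenize(text)
    hits = sum(1 for tok in tokens if tok in lexicon)
    ratio = hits / len(tokens) if tokens else 0.0
    return [hits, ratio]

# These two numbers could be appended to any other feature vector
# before it is handed to a classification algorithm.
features = lexicon_features("Eres un idiota y un tonto")
```

Exact token matching misses inflected or misspelled forms, so a real system would likely need stemming or fuzzy matching on top of this.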

Highlights

The growing use of social networks has provided a channel to unrestrictedly express feelings and opinions on a mass scale
There is a smaller body of work that combines, in one way or another, the Machine Learning (ML) approach with the use of lexicons, for example by using predefined lists of bad words that, once detected, serve as features in ML [13,15,17]
This article presented several hybrid models that combine the Lexicon and Machine Learning approaches to analyze emotions in user comments and detect aggression in texts written in Spanish. Five approaches are proposed to create the models: Lexicon, TF_IDF_Lexicon, WE_Lexicon, WE_Lexicon_TF-IDF, and the Ensemble approach, which differ mainly in how they extract the feature vector from the text
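Since the approaches are said to differ in how the feature vector is extracted, one plausible reading of a TF-IDF-plus-lexicon vector is sketched below: a TF-IDF vector over the corpus vocabulary with a lexicon hit count appended as the last dimension. This is an assumption about the general shape of such a feature, not the authors' TF_IDF_Lexicon implementation; the lexicon, the function name, the whitespace tokenizer, and the smoothed IDF formula are all illustrative choices.

```python
import math
from collections import Counter

# Hypothetical lexicon, for illustration only.
LEXICON = {"idiota", "tonto"}

def tfidf_lexicon_vectors(docs, lexicon=LEXICON):
    """Build one vector per document: TF-IDF weights plus a lexicon hit count."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each token.
    df = Counter(tok for toks in tokenized for tok in set(toks))
    # Smoothed IDF, so that no weight is zero or undefined.
    idf = {tok: math.log((1 + n_docs) / (1 + df[tok])) + 1.0 for tok in vocab}
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks) or 1  # avoid division by zero on empty documents
        tfidf = [counts[tok] / total * idf[tok] for tok in vocab]
        bad = sum(counts[word] for word in lexicon)
        vectors.append(tfidf + [float(bad)])  # lexicon count is the last dimension
    return vocab, vectors

vocab, vectors = tfidf_lexicon_vectors(["eres un idiota", "hola amigo"])
```

Each resulting vector has `len(vocab) + 1` entries and could be passed to any standard classifier; a word-embedding variant would swap the TF-IDF part for an averaged embedding while keeping the appended lexicon dimension.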

Summary

Introduction

The growing use of social networks has provided a channel to unrestrictedly express feelings and opinions on a mass scale. This study detected the existence of a high percentage of related situations: between 3.5% and 58% of cyber-victims, and between 2.5% and 32% of cyberaggressors. There is a smaller body of work that combines, in one way or another, the ML approach with the use of lexicons, for example by using predefined lists of bad words that, once detected, serve as features in ML [13,15,17]. The work in [17] used the lexicon-based approach exclusively, with 9 bad words chosen by the authors for their high frequency in situations labeled as cyberbullying, and applied morphological analysis and information retrieval techniques to determine the degree of aggression.

Journal: Applied Sciences
Publication Date: Nov 12, 2021
Citations: 8
License type: CC BY 4.0
