Bio-Inspired Algorithm Based Undersampling Approach and Ensemble Learning for Twitter Spam Detection

K Kiruthika Devi, G A Sathish Kumar

doi:10.1142/s0218488524500016

Abstract

Social media networks such as Facebook and Twitter have evolved into valuable platforms for global communication. However, because of its extensive user base, Twitter is often misused by illegitimate users engaging in illicit activities. Although numerous studies address illegitimate users on Twitter, most of them do not address class imbalance, which significantly impacts the effectiveness of spam detection. The few studies that do address class imbalance have not applied bio-inspired algorithms to balance the dataset. We therefore introduce PSOB-U, a particle swarm optimization-based undersampling technique designed to balance the Twitter dataset. In PSOB-U, various classifiers and metrics are employed to select majority-class samples and rank them. Furthermore, an ensemble learning approach combines the base classifiers in three stages. While the base classifiers are trained, undersampling techniques and a cost-sensitive random forest (CS-RF) address the imbalance at both the data level and the algorithm level. In the first stage, the imbalanced dataset is balanced in three separate ways: random undersampling, particle swarm optimization-based undersampling, and random oversampling. In the second stage, a classifier is constructed for each of the three balanced datasets. In the third stage, majority voting aggregates the predicted outputs of the three classifiers. The evaluation results demonstrate that the proposed method significantly enhances the detection of illegitimate users in the imbalanced Twitter dataset. We also compare the proposed method with existing models, and the results indicate that it outperforms state-of-the-art spam detection models that address the class imbalance problem. The combination of particle swarm optimization-based undersampling and ensemble learning with majority voting results in more accurate spam detection.
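The three-stage flow described in the abstract (balance the data three ways, train one classifier per balanced set, aggregate by majority vote) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration, not the authors' implementation: `centroid_undersample` is a hypothetical stand-in that does not perform particle swarm optimization, `NearestCentroid` is a toy base classifier used in place of the cost-sensitive random forest, and class 0 is assumed to be the larger (majority) class.

```python
import random
from collections import Counter


def _split(y, majority):
    """Return index lists for the majority class and for everything else."""
    maj = [i for i, t in enumerate(y) if t == majority]
    mino = [i for i, t in enumerate(y) if t != majority]
    return maj, mino


def _sq_dist(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))


def random_undersample(X, y, rng, majority=0):
    """Stage 1a: drop random majority samples until the classes are equal."""
    maj, mino = _split(y, majority)
    keep = rng.sample(maj, len(mino)) + mino
    return [X[i] for i in keep], [y[i] for i in keep]


def random_oversample(X, y, rng, majority=0):
    """Stage 1c: duplicate random minority samples until the classes are equal."""
    maj, mino = _split(y, majority)
    extra = [rng.choice(mino) for _ in range(len(maj) - len(mino))]
    keep = maj + mino + extra
    return [X[i] for i in keep], [y[i] for i in keep]


def centroid_undersample(X, y, rng, majority=0):
    """Stage 1b placeholder (NOT PSO): keep the majority samples that lie
    closest to the minority centroid. A real PSOB-U step would go here."""
    maj, mino = _split(y, majority)
    center = [sum(col) / len(mino) for col in zip(*(X[i] for i in mino))]
    ranked = sorted(maj, key=lambda i: _sq_dist(X[i], center))
    keep = ranked[:len(mino)] + mino
    return [X[i] for i in keep], [y[i] for i in keep]


class NearestCentroid:
    """Toy base classifier (assumption): predicts the class with the nearest mean."""

    def fit(self, X, y):
        self.centroids = {}
        for label in set(y):
            rows = [x for x, t in zip(X, y) if t == label]
            self.centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def predict(self, X):
        return [min(self.centroids, key=lambda c: _sq_dist(x, self.centroids[c]))
                for x in X]


def train_ensemble(X, y, samplers, seed=0):
    """Stage 2: train one classifier per balanced copy of the data."""
    rng = random.Random(seed)
    return [NearestCentroid().fit(*sampler(X, y, rng)) for sampler in samplers]


def majority_vote(models, X):
    """Stage 3: each model votes per sample; the most common label wins."""
    votes = zip(*(m.predict(X) for m in models))
    return [Counter(v).most_common(1)[0][0] for v in votes]


# Tiny imbalanced toy set: six "legitimate" points (0), two "spam" points (1).
X = [[0, 0], [0, 1], [1, 0], [1, 1], [0.5, 0.5], [0.2, 0.8], [5, 5], [5, 6]]
y = [0, 0, 0, 0, 0, 0, 1, 1]
samplers = [random_undersample, centroid_undersample, random_oversample]
models = train_ensemble(X, y, samplers)
print(majority_vote(models, [[0.1, 0.1], [5.2, 5.4]]))  # [0, 1]
```

With three voters and two classes, a vote can never tie, which is one practical reason to use an odd number of base classifiers. In a fuller experiment the toy classifier would be replaced by a random forest with class weights, and the placeholder sampler by the PSO-based selection; neither is reproduced here because the abstract does not give those details.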

More From: International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems

Similar Papers

Imbalance Learning and Its Application on Medical Datasets
Yachao Shao
21 Feb 2022

Performance Evaluation of Sentiment Analysis on Balanced and Imbalanced Dataset Using Ensemble Approach
Shini George ... V Srividhya
Indian Journal of Science and Technology | VOL. 15
05 May 2022

Addressing the class imbalance problem in Twitter spam detection using ensemble learning
Shigang Liu ... Yang Xiang
Computers & Security | VOL. 69
13 Dec 2016

Noise-adaptive synthetic oversampling technique
Minh Thanh Vo ... Trang Nguyen
Applied Intelligence | VOL. 51
18 Mar 2021
