Abstract

Social media bots (SMBs), and political SMBs in particular, have played a major role in spreading misinformation, manipulating public opinion, and harassing and intimidating users of online social networks (OSNs). This article reviews previous work on the detection and analysis of political SMB activity and addresses three challenges that significantly affect the effectiveness of SMB detection models: feature engineering, feature selection (FS), and model implementation. Over 33 features were extracted from the Twibot-20 dataset, covering content, user information, network, behavior, and temporal features. Several families of FS techniques (basic, filter, wrapper, embedded, and hybrid) are explored and compared to select the optimal features. The optimal features are then used to train multiple machine-learning algorithms. To balance the dataset, the synthetic minority oversampling technique coupled with edited nearest neighbors (SMOTE-ENN) is used. The results show an improvement in model performance, from an initial Area Under the Curve (AUC) of 90.40% and accuracy of 81.60% using the original set to 99.50% on the test set and 100% on the training set across all metrics used. Decision Trees, Random Forest, Gradient Boosting, AdaBoost, XGBoost, and Extra Trees emerge as the most effective models for detecting political SMBs.
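The workflow summarized above (balance the training data, select features, train a tree ensemble, report AUC and accuracy) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are synthetic stand-ins for the Twibot-20 features, the filter method is scikit-learn's ANOVA F-test via `SelectKBest`, the classifier is a Random Forest with default-style settings, and the function name `run_pipeline`, the choice of `k=10`, and the 75/25 split are hypothetical. SMOTE-ENN is taken from the imbalanced-learn package when it is installed and is skipped otherwise.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import train_test_split

try:
    # Optional dependency: imbalanced-learn provides SMOTE-ENN.
    from imblearn.combine import SMOTEENN
except ImportError:
    SMOTEENN = None


def run_pipeline(n_features=33, k=10, seed=0):
    """Hypothetical end-to-end sketch on synthetic data (not Twibot-20)."""
    # Synthetic, mildly imbalanced binary data standing in for bot/human labels.
    X, y = make_classification(
        n_samples=1000,
        n_features=n_features,
        n_informative=8,
        weights=[0.7, 0.3],
        random_state=seed,
    )
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed
    )

    # Resample the training split only, so the test split stays untouched.
    if SMOTEENN is not None:
        X_train, y_train = SMOTEENN(random_state=seed).fit_resample(X_train, y_train)

    # Filter-style feature selection: keep the k features with the highest F-score.
    selector = SelectKBest(f_classif, k=k).fit(X_train, y_train)
    X_train_sel = selector.transform(X_train)
    X_test_sel = selector.transform(X_test)

    clf = RandomForestClassifier(n_estimators=100, random_state=seed)
    clf.fit(X_train_sel, y_train)

    proba = clf.predict_proba(X_test_sel)[:, 1]
    pred = clf.predict(X_test_sel)
    return {
        "n_selected": X_test_sel.shape[1],
        "auc": roc_auc_score(y_test, proba),
        "accuracy": accuracy_score(y_test, pred),
    }


if __name__ == "__main__":
    print(run_pipeline())
```

Fitting the resampler and the selector on the training split alone is a deliberate choice in this sketch: it keeps the held-out scores from being inflated by information leaking from the test data.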
