Abstract

Many social media platforms have emerged as a result of the rapid expansion of online social networks (OSNs). As these platforms have become important in day-to-day life, spammers have turned their attention to them. Spam detection follows two main approaches: machine learning (ML) and expert-based detection. The accuracy of expert-based detection depends on expert knowledge, and the manual process is time-consuming. Thus, ML-based spam detection is preferred in OSNs. Spam identification on social networks is a difficult task involving a variety of factors, and the unequal proportions of spam and ham produce an imbalanced data distribution, which gives spammers an advantage in compromising users' devices. ML algorithms such as Logistic Regression (LR), K-Nearest Neighbour (KNN), Decision Trees (DT), Random Forest (RF), Support Vector Machine (SVM), XGB, and a Voting Classifier (VC), among others, are used to address this imbalance and to attain high classification accuracy. Text is converted to feature vectors by vectorizers, and the results for each configuration are recorded. The experimental results show that, compared to KNN, NB, ETC, RF, SVC, LR, XGB, and DT, the proposed VC provides a higher classification accuracy of 97.96%. The validation results indicate that the proposed methods are effective on both balanced and imbalanced datasets. A website was also created to classify messages as spam or not spam.
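To illustrate the voting idea mentioned above, the following is a minimal sketch of hard (majority) voting over several base classifiers. It is not the authors' implementation: the abstract does not say whether hard or soft voting was used, and the three keyword rules (`has_link`, `has_prize_word`, `is_shouting`) are hypothetical stand-ins for trained models such as LR, RF, or SVM.

```python
from collections import Counter


def hard_vote(predictions):
    """Return the label predicted by the most classifiers.

    Ties go to the tied label that appears first in `predictions`.
    """
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label


# Hypothetical stand-ins for trained base classifiers (assumption:
# each maps a message string to the label "spam" or "ham").
def has_link(text):
    return "spam" if "http" in text.lower() else "ham"


def has_prize_word(text):
    lowered = text.lower()
    return "spam" if any(w in lowered for w in ("win", "free", "prize")) else "ham"


def is_shouting(text):
    letters = [c for c in text if c.isalpha()]
    if not letters:
        return "ham"
    upper_ratio = sum(c.isupper() for c in letters) / len(letters)
    return "spam" if upper_ratio > 0.5 else "ham"


BASE_CLASSIFIERS = (has_link, has_prize_word, is_shouting)


def classify(text, classifiers=BASE_CLASSIFIERS):
    """Collect one prediction per base classifier and take the majority."""
    return hard_vote([clf(text) for clf in classifiers])
```

With these toy rules, `classify("WIN a FREE prize at http://example.com")` receives two "spam" votes and one "ham" vote, so the ensemble labels it spam even though one base rule disagrees. In a real pipeline the base rules would be replaced by models trained on vectorized text.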
