Abstract

The proliferation of spam has degraded users' experience of email and social media, making effective spam filtering essential. Internet companies already use machine learning algorithms to detect and block spam messages. Because there are many popular social media platforms, it is important to evaluate how well prevalent machine learning algorithms perform across them. This study assesses the performance of Support Vector Machine (SVM), Linear Regression, and Random Forest on social media data. Datasets of spam and non-spam messages were sourced from YouTube comment sections and Twitter. The text was transformed with a vectorizer into numerical features that machine learning models can interpret. Three models, one each for SVM, Linear Regression, and Random Forest, were trained and then applied to detect spam messages in the test dataset, the YouTube comment set, and the Tweet set. Performance was evaluated using accuracy, F1 score, and precision. The findings indicate that the models do not perform satisfactorily across the social media datasets, showing a significant reduction in accuracy.
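The three evaluation measures named above can be made concrete with a short sketch. It is illustrative only and is not the study's code: the function name `binary_metrics`, the encoding of spam as `1`, and the example labels are all assumptions introduced here, and the labels are invented rather than taken from the study's data.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for binary labels.

    `positive` is the label treated as spam (an assumption of this sketch).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one label is required")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # spam caught
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # ham flagged as spam
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # spam missed
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    # Fall back to 0.0 when a denominator is zero (no predicted or no actual spam).
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical labels: 1 = spam, 0 = not spam.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
scores = binary_metrics(y_true, y_pred)
print(scores)
```

Reporting precision and F1 alongside accuracy matters for spam filtering because the classes are often imbalanced: a model that labels everything as non-spam can still reach high accuracy while catching no spam at all.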
