Abstract

Internet-based technology has become a primary need. According to survey results from the Central Statistics Agency in collaboration with APJII, sending and receiving email ranks above social media use, reaching 95.75%. Such intensive use of email has both positive and negative effects: although email is a communication tool, not everyone uses it responsibly, and it is frequently misused in ways that can harm others. Misused email of this kind is commonly known as spam or junk mail, and it contains advertisements, scams, and even viruses. In this study, Gmail emails were processed with text mining and then tested with several data mining classification methods, including the Naïve Bayes algorithm, SVM, and Random Forest, as well as in combination with Particle Swarm Optimization (PSO), to predict spam emails and identify the most accurate algorithm. Measuring the performance of the four algorithms with a confusion matrix and ROC shows that the Naïve Bayes algorithm with Particle Swarm Optimization (PSO) has the highest accuracy, 81.40%, with an AUC of 0.78.
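
The two evaluation measures named above, accuracy from a confusion matrix and AUC from the ROC curve, can be illustrated with a minimal sketch. The function names, the toy labels, and the toy scores below are hypothetical and are not taken from the study; the AUC is computed with the standard pairwise-ranking definition rather than any particular tool the authors may have used.

```python
def accuracy(tp, fp, fn, tn):
    """Accuracy from confusion-matrix counts: correct predictions / all predictions."""
    return (tp + tn) / (tp + fp + fn + tn)


def auc(labels, scores):
    """AUC as the probability that a randomly chosen spam email receives a
    higher score than a randomly chosen non-spam email (ties count as half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (1 = spam, 0 = not spam); not results from the study.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]

# Thresholding the scores at 0.5 gives tp=2, fn=1, fp=1, tn=2.
print(accuracy(tp=2, fp=1, fn=1, tn=2))
print(auc(labels, scores))
```

Accuracy depends on the chosen decision threshold, whereas AUC summarizes ranking quality across all thresholds, which is why the two are usually reported together.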
