Abstract

Email remains an integral part of daily life and a primary means of communication on the internet. Spam consumes large amounts of storage and bandwidth, and existing spam filtering techniques have weaknesses, including the misclassification of genuine emails as spam (false positives); both are growing challenges for the internet community. This work proposed the use of a metaheuristic optimization algorithm, the whale optimization algorithm (WOA), to select salient features in the email corpus, and the rotation forest algorithm to classify emails as spam or non-spam. The entire datasets were used, and the rotation forest algorithm was evaluated before and after feature selection with WOA. The results showed that, after feature selection with WOA, the rotation forest algorithm classified emails as spam or non-spam with an accuracy of 99.9% and a low false positive (FP) rate of 0.0019. This indicates that the proposed method is a marked improvement over some previous methods.
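
To make the two reported metrics concrete, the following is a minimal sketch of how accuracy and FP rate are conventionally computed from true and predicted labels. It assumes spam is treated as the positive class; the function name, the label strings, and the toy labels are hypothetical and are not taken from the paper's data or code.

```python
def accuracy_and_fp_rate(y_true, y_pred, positive="spam"):
    """Return (accuracy, false positive rate) for a binary labelling.

    Assumption: `positive` marks the spam class, so a false positive is a
    genuine email that was predicted to be spam.
    """
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1  # spam correctly flagged
            else:
                fp += 1  # genuine email wrongly flagged as spam
        else:
            if truth == positive:
                fn += 1  # spam that slipped through
            else:
                tn += 1  # genuine email correctly passed

    total = tp + fp + tn + fn
    negatives = fp + tn  # all genuinely non-spam emails
    accuracy = (tp + tn) / total if total else 0.0
    fp_rate = fp / negatives if negatives else 0.0
    return accuracy, fp_rate


# Toy labels for illustration only (not results from the paper).
y_true = ["spam", "ham", "ham", "spam", "ham"]
y_pred = ["spam", "spam", "ham", "spam", "ham"]
acc, fpr = accuracy_and_fp_rate(y_true, y_pred)
```

Under this convention, an FP rate of 0.0019 would mean that roughly 0.19% of genuine emails were flagged as spam.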
