Abstract

Distinguishing spam from non-spam emails is a major challenge for email service providers and users alike. Spam emails waste Internet traffic and often contain malicious links, most of which direct users to phishing webpages. Spam also plays a role in spreading malware across the network, which further emphasizes the need for its detection. Although data mining methods such as artificial neural networks (ANNs) have been applied to spam detection, their output is prone to significant error, mostly because all the spam features are included in the training stage. To reduce the spam detection error, this paper presents a feature selection-based method that uses the sine–cosine algorithm (SCA). In the proposed method, feature vectors are updated by the SCA to select the optimal features for training the ANN. Implementation of the proposed method on the Spambase dataset in MATLAB yielded a precision, accuracy and sensitivity of 98.64%, 97.92% and 98.36%, respectively. The proposed method thus outperformed the multilayer perceptron (MLP) neural network, Bayesian network, decision tree and random forest classifiers in spam detection. According to the test results, using the SCA for feature selection reduced the error of the MLP neural network by approximately 2.18%.
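The feature-selection loop summarized above can be illustrated with a minimal Python sketch. It uses the standard SCA position update; the clamping to [0, 1], the 0.5 threshold used to turn a position into a feature mask, and the function names are assumptions made for illustration and are not taken from the paper. The `toy_error` fitness is a hypothetical stand-in for the validation error of a trained ANN.

```python
import math
import random


def sca_step(position, best, t, max_iter, a=2.0, rng=random):
    """One sine-cosine update of a candidate toward the best solution so far."""
    r1 = a - t * (a / max_iter)  # step amplitude, shrinks linearly from a toward 0
    new_position = []
    for x, p in zip(position, best):
        r2 = rng.uniform(0.0, 2.0 * math.pi)
        r3 = rng.uniform(0.0, 2.0)
        r4 = rng.random()
        wave = math.sin(r2) if r4 < 0.5 else math.cos(r2)
        x_new = x + r1 * wave * abs(r3 * p - x)
        new_position.append(min(1.0, max(0.0, x_new)))  # assumed clamp to [0, 1]
    return new_position


def to_mask(position, threshold=0.5):
    """Assumed binarization: keep a feature when its coordinate exceeds threshold."""
    return [x > threshold for x in position]


def select_features(fitness, n_features, n_agents=10, max_iter=50, seed=0):
    """Search for the feature mask with the lowest fitness (error) value."""
    rng = random.Random(seed)
    agents = [[rng.random() for _ in range(n_features)] for _ in range(n_agents)]
    best = min(agents, key=lambda pos: fitness(to_mask(pos)))
    best_score = fitness(to_mask(best))
    for t in range(max_iter):
        for i, pos in enumerate(agents):
            agents[i] = sca_step(pos, best, t, max_iter, rng=rng)
            score = fitness(to_mask(agents[i]))
            if score < best_score:
                best, best_score = list(agents[i]), score
    return to_mask(best), best_score


def toy_error(mask):
    """Hypothetical fitness: features 0 and 3 are informative, extras cost a little."""
    missing = sum(1 for i in (0, 3) if not mask[i])
    return missing + 0.01 * sum(mask)


if __name__ == "__main__":
    mask, error = select_features(toy_error, n_features=6)
    print(mask, error)
```

In a real pipeline, the fitness function would train the classifier on the columns selected by the mask and return its error on held-out data, so each evaluation is far more expensive than in this toy setting.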
