Spam emails, also known as non-self, are unsolicited commercial or malicious emails, sent to affect either a single individual or a corporation or a group of people. Besides advertising, these may contain links to phishing or malware hosting websites set up to steal confidential information. In this paper, a study of the effectiveness of using a Negative Selection Algorithm (NSA) for anomaly detection applied to spam filtering is presented. NSA has a high performance and a low false detection rate. The designed framework intelligently works through three detection phases to finally determine an email’s legitimacy based on the knowledge gathered in the training phase. The system operates by elimination through Negative Selection similar to the functionality of T-cells’ in biological systems. It has been observed that with the inclusion of more datasets, the performance continues to improve, resulting in a 6% increase of True Positive and True Negative detection rate while achieving an actual detection rate of spam and ham of 98.5%. The model has been further compared against similar studies, and the result shows that the proposed system results in an increase of 2 to 15% in the correct detection rate of spam and ham.
Read full abstract