Spam filtering with abductive networks

El-Sayed M El-Alfy, Radwan E Abdel-Aal

doi:10.1109/ijcnn.2008.4633784

Abstract

Spam messages pose a major threat to the usability of electronic mail. Spam wastes time and money for network users and administrators, consumes network bandwidth and storage space, and slows down email servers. In addition, it provides a medium for distributing harmful code and offensive content. In this paper, we investigate the application of abductive learning to filtering out spam messages. We study the performance of various network models on the spambase dataset. Results reveal that a classification accuracy of 91.7% can be achieved using only 10 of the 57 available content attributes. The attributes are selected automatically by the abductive learning algorithm as the most effective feature subset, thus achieving approximately 6:1 data reduction. Comparison with other techniques, such as multi-layer perceptrons and naive Bayesian classifiers, shows that the abductive learning approach can provide better spam detection accuracy, e.g., false positive rates as low as 5.9%, while requiring much shorter training times.
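As a reading aid for the figures quoted above, the following minimal sketch shows how accuracy, false positive rate, and the data reduction ratio are conventionally computed. It assumes spam is the positive class, so a false positive is a legitimate message flagged as spam; the paper's exact definitions may differ. The function names and the confusion-matrix counts are hypothetical and are not results from the paper; only the 57 and 10 attribute counts come from the abstract.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return (accuracy, false_positive_rate) from confusion-matrix counts.

    Assumption: spam is the positive class, so
      tp = spam correctly flagged,      fn = spam missed,
      tn = legitimate mail passed,      fp = legitimate mail flagged as spam.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    false_positive_rate = fp / (fp + tn)
    return accuracy, false_positive_rate


def reduction_ratio(available, selected):
    """Ratio of available attributes to attributes actually used."""
    return available / selected


if __name__ == "__main__":
    # Hypothetical counts, chosen only to illustrate the arithmetic.
    acc, fpr = classification_metrics(tp=90, fp=5, tn=95, fn=10)
    print(f"accuracy = {acc:.3f}, false positive rate = {fpr:.3f}")

    # Attribute counts taken from the abstract: 10 selected out of 57.
    print(f"data reduction = {reduction_ratio(57, 10):.1f}:1")
```

With 57 available attributes and 10 selected, the ratio is 5.7, which the abstract rounds to approximately 6:1.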

Full Text