Spam mail classification using back propagation neural networks

Jingfeng Chen

doi:10.54254/2755-2721/5/20230617

Abstract

Mail classification methods based on machine learning have been introduced to combat spam. However, few studies focus on neural networks, the most powerful machine learning model. In this paper, the author trains back propagation (BP) neural networks to detect spam. The inputs to the networks are limited to information about words, punctuation marks, signs, numbers and illegal words. Five networks that differ in the number of layers and the number of neurons per layer are tested. All networks use Rectified Linear Unit (ReLU) activation functions and the momentum learning technique. The results show that the network with four hidden layers achieves the best classification accuracy, 97.0%. Among the networks with two hidden layers, those with more than 300 neurons in each layer reach an accuracy between 95.5% and 96.0%, while 100 neurons in each layer give an accuracy of 93.8%. Although training captures only information about words, punctuation marks and signs, the networks achieve high accuracy. The author suggests that enabling the computer to understand sentences, along with other improvements, could lead to even higher performance.
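As a rough illustration of the kind of model the abstract describes, the sketch below trains a small fully connected network with ReLU hidden layers using back propagation and momentum updates. It is not the author's implementation: the function names, the sigmoid output with cross-entropy loss, the He initialisation, the layer sizes, the learning rate and the momentum coefficient are all assumptions chosen for the example, and the feature extraction for words, punctuation marks, signs, numbers and illegal words is not shown.

```python
import numpy as np

def init_network(layer_sizes, rng):
    """Create (weights, bias) pairs; He initialisation is an assumption."""
    params = []
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        W = rng.normal(0.0, np.sqrt(2.0 / n_in), size=(n_in, n_out))
        params.append((W, np.zeros(n_out)))
    return params

def forward(params, X):
    """Return activations of every layer: ReLU hidden layers, sigmoid output."""
    activations = [X]
    for i, (W, b) in enumerate(params):
        z = activations[-1] @ W + b
        if i < len(params) - 1:
            a = np.maximum(z, 0.0)           # ReLU
        else:
            a = 1.0 / (1.0 + np.exp(-z))     # sigmoid (assumed output unit)
        activations.append(a)
    return activations

def backward(params, activations, y):
    """Back propagate the mean binary cross-entropy loss through the network."""
    grads = []
    # For a sigmoid output with cross-entropy, dLoss/dz is (prediction - target).
    delta = (activations[-1] - y.reshape(-1, 1)) / y.shape[0]
    for i in reversed(range(len(params))):
        W, _ = params[i]
        a_prev = activations[i]
        grads.append((a_prev.T @ delta, delta.sum(axis=0)))
        if i > 0:
            # ReLU derivative: 1 where the hidden unit was active, else 0.
            delta = (delta @ W.T) * (a_prev > 0)
    return grads[::-1]

def train(X, y, hidden=(16, 16), lr=0.05, momentum=0.9, epochs=300, seed=0):
    """Full-batch gradient descent with classical momentum (hypothetical settings)."""
    rng = np.random.default_rng(seed)
    params = init_network([X.shape[1], *hidden, 1], rng)
    velocity = [(np.zeros_like(W), np.zeros_like(b)) for W, b in params]
    for _ in range(epochs):
        grads = backward(params, forward(params, X), y)
        for k in range(len(params)):
            (W, b), (gW, gb), (vW, vb) = params[k], grads[k], velocity[k]
            vW = momentum * vW - lr * gW     # velocity accumulates past gradients
            vb = momentum * vb - lr * gb
            velocity[k] = (vW, vb)
            params[k] = (W + vW, b + vb)
    return params

def predict(params, X):
    """Label a message as spam (1) when the output probability is at least 0.5."""
    return (forward(params, X)[-1].ravel() >= 0.5).astype(int)
```

Under these assumptions, a call such as train(X, y, hidden=(300, 300)) would correspond to a network with two hidden layers of 300 neurons, where X is a matrix of per-message feature values and y holds 0/1 labels. The accuracy figures quoted in the abstract come from the author's own data and configuration and should not be expected from this sketch.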

Full Text