Abstract

Spam email is a serious concern: it can be used to steal users' personal information and cause large financial losses, and the threat grows with the rising number of internet users. The demand for accurate spam filtering has increased accordingly. Existing techniques struggle to capture the intricate relationships between words in an email when they rely on certain word embedding techniques, and learning rate tuning remains one of the greatest challenges of stochastic optimization. To overcome these difficulties, the proposed framework performs diverse ensemble-based email spam classification by combining multiple word embeddings with the Continuous Coin Betting (COCOB) optimizer. Word2Vec produces the first set of 200-dimensional word embeddings, GloVe produces a second 200-dimensional set, and Bidirectional Encoder Representations from Transformers (BERT) produces a 768-dimensional set. The generated embeddings are then classified by a diverse ensemble whose base-level classifiers are a Long Short-Term Memory (LSTM) network, a Gated Recurrent Unit (GRU), and a Bidirectional Gated Recurrent Unit (Bi-GRU), with an LSTM as the meta-classifier, trained using the COCOB optimizer. Experiments were conducted on three benchmark email datasets, and the results show that the proposed system performs well, with a low false positive rate.
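
To illustrate why a coin-betting optimizer removes the need for learning rate tuning, the following is a minimal sketch of a COCOB-Backprop style update applied to a one-dimensional toy problem. It is not the authors' implementation: the function name `cocob_minimize`, the toy quadratic, the step count, and the small constant `eps` are assumptions made for illustration, and `alpha = 100` is the default commonly quoted for this optimizer.

```python
def cocob_minimize(grad_fn, w_init, steps=500, alpha=100.0, eps=1e-8):
    """Minimize a 1-D function with a COCOB-Backprop style update.

    Hypothetical helper for illustration; note that no learning rate appears.
    """
    w = w_init
    max_grad = eps       # largest gradient magnitude seen so far
    grad_abs_sum = 0.0   # running sum of gradient magnitudes
    reward = 0.0         # accumulated "winnings" of the bettor
    theta = 0.0          # running sum of negative gradients
    for _ in range(steps):
        g = -grad_fn(w)  # the outcome of each bet is the negative gradient
        max_grad = max(max_grad, abs(g))
        grad_abs_sum += abs(g)
        # Winnings grow when the current offset from w_init agrees with g.
        reward = max(reward + (w - w_init) * g, 0.0)
        theta += g
        # Fraction of current wealth to bet, derived from gradient statistics.
        beta = theta / (max_grad * max(grad_abs_sum + max_grad, alpha * max_grad))
        w = w_init + beta * (max_grad + reward)
    return w


if __name__ == "__main__":
    # Toy objective f(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
    print(cocob_minimize(lambda w: 2.0 * (w - 3.0), w_init=0.0))
```

In this toy case the iterate is expected to settle near the minimizer at 3 without any step size being chosen by hand. The step taken is instead derived from the accumulated gradients and the "reward" earned so far. In the framework described above, an update of this kind would be applied per parameter of the LSTM, GRU, and Bi-GRU networks.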
