Abstract

Email remains one of the most widely used means of communication. However, because sending email costs nothing once a mail server and a domain name are available, spam has become a critical problem for the email network. Conventionally, the industry counters spam with filters based on hand-crafted rules and Bayesian inference, which reach an accuracy of 98.76%, still far from satisfactory. Hence, to better protect email users from unsolicited messages containing advertisements, sensitive material, phishing content, and viruses, a new approach is proposed in which email content is filtered by a spam detector built on Bidirectional Encoder Representations from Transformers (BERT). BERT is a language representation model published by Google that has achieved great success owing to its strong natural language understanding capabilities. After training on a spam corpus from Kaggle, the BERT-based spam detector reaches a binary classification accuracy of 99.40%.
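
The abstract describes fine-tuning BERT as a binary spam classifier on a Kaggle corpus. Below is a minimal sketch of such a pipeline using the Hugging Face transformers library; this is not the authors' code, and the file name spam.csv, the column names, and the training hyperparameters are assumptions for illustration only.

    # Minimal sketch: fine-tune BERT for binary spam classification.
    # Assumes a hypothetical Kaggle CSV "spam.csv" with columns
    # "text" and "label" (0 = ham, 1 = spam).
    import pandas as pd
    import torch
    from torch.utils.data import Dataset
    from transformers import (BertTokenizerFast, BertForSequenceClassification,
                              Trainer, TrainingArguments)

    class SpamDataset(Dataset):
        def __init__(self, texts, labels, tokenizer, max_len=128):
            # Tokenize all messages up front to fixed-length input IDs.
            self.enc = tokenizer(texts, truncation=True, padding="max_length",
                                 max_length=max_len)
            self.labels = labels

        def __len__(self):
            return len(self.labels)

        def __getitem__(self, idx):
            item = {k: torch.tensor(v[idx]) for k, v in self.enc.items()}
            item["labels"] = torch.tensor(self.labels[idx])
            return item

    df = pd.read_csv("spam.csv")
    split = int(0.8 * len(df))  # simple 80/20 train/eval split
    tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
    train_ds = SpamDataset(df.text[:split].tolist(), df.label[:split].tolist(), tokenizer)
    eval_ds = SpamDataset(df.text[split:].tolist(), df.label[split:].tolist(), tokenizer)

    # Pretrained BERT with a 2-way classification head (spam vs. ham).
    model = BertForSequenceClassification.from_pretrained("bert-base-uncased",
                                                          num_labels=2)
    args = TrainingArguments(output_dir="spam-bert", num_train_epochs=2,
                             per_device_train_batch_size=16)
    Trainer(model=model, args=args, train_dataset=train_ds,
            eval_dataset=eval_ds).train()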
