Abstract

Owning and using an email address is an inherent part of everyday life and work on a computer. The main aim of this paper is to analyze existing approaches to the classification of malicious emails. We implemented a system that first distinguishes legitimate emails from malicious ones and then classifies the malicious emails into three subcategories: spam, scam, and phishing. To train and evaluate it, we prepared a labeled dataset and extracted several features from the emails it contains. Within the system, we implemented and evaluated four supervised machine learning methods: Random Forest, Decision Tree, Support Vector Machines, and k-Nearest Neighbors. According to our results, Random Forest is the most suitable of these approaches for email classification.
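The two-stage structure described above (legitimate vs. malicious first, then spam, scam, or phishing) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not list the extracted features, so the keyword sets, the `extract_features` function, and the toy emails below are all hypothetical. For brevity the sketch uses only a hand-written k-Nearest Neighbors vote, one of the four methods named, rather than Random Forest.

```python
import math
import re
from collections import Counter

# Hypothetical keyword features; the paper does not say which features it used.
URGENT = {"urgent", "verify", "suspended", "password"}
MONEY = {"million", "inheritance", "transfer", "lottery"}
PROMO = {"free", "offer", "discount", "buy"}

def extract_features(text):
    """Map an email body to a small numeric feature vector (illustrative only)."""
    words = re.findall(r"[a-z']+", text.lower())
    return [
        len(re.findall(r"https?://", text)),  # number of links
        sum(w in URGENT for w in words),      # urgency / credential words
        sum(w in MONEY for w in words),       # money-transfer words
        sum(w in PROMO for w in words),       # promotional words
    ]

def knn_predict(train, x, k=1):
    """Majority label among the k training vectors closest to x."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

# Toy labeled data, invented for this sketch (not the paper's dataset).
TOY = [
    ("Meeting moved to 3pm, agenda attached.", "legitimate"),
    ("Lunch on Friday? Let me know.", "legitimate"),
    ("Buy now! Free discount offer http://shop.example", "spam"),
    ("You won the lottery, transfer fee needed for your million inheritance", "scam"),
    ("Urgent: verify your password at http://bank.example or account suspended", "phishing"),
]

# Stage 1 sees two classes; stage 2 sees only the malicious subcategories.
STAGE1 = [(extract_features(t), "legitimate" if y == "legitimate" else "malicious")
          for t, y in TOY]
STAGE2 = [(extract_features(t), y) for t, y in TOY if y != "legitimate"]

def classify(text, k=1):
    """Stage 1: legitimate vs. malicious. Stage 2: spam, scam, or phishing."""
    x = extract_features(text)
    if knn_predict(STAGE1, x, k) == "legitimate":
        return "legitimate"
    return knn_predict(STAGE2, x, k)
```

Calling `classify` on a new message returns either `"legitimate"` or one of the three malicious subcategories. A realistic system would need a far larger dataset and richer features, and would compare the classifiers on held-out data as the paper describes.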
