Abstract
Owning and using an email address is an inherent part of everyday life and work on a computer. The main aim of this paper is to analyze existing approaches to the classification of malicious emails. We implemented a system that distinguishes between legitimate and malicious emails and then classifies the malicious emails into three subcategories: spam, scam, and phishing. We prepared a labeled dataset and extracted several features from the emails it contains. Within the system, we implemented and evaluated four supervised machine learning methods (Random Forest, Decision Tree, Support Vector Machines, and k-Nearest Neighbors). According to our results, Random Forest is the most suitable of the four approaches for email classification.
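The two-stage decision described above (legitimate versus malicious first, then spam, scam, or phishing) can be illustrated with a minimal sketch. It is not the system from the paper: it uses only k-Nearest Neighbors, one of the four methods named, and the four features, the toy training rows, and the names `TRAIN`, `knn_predict`, and `classify` are hypothetical and chosen purely for illustration.

```python
from collections import Counter
from math import dist

# Hypothetical feature vector per email (an assumption, not the paper's features):
# (link_count, urgency_word_count, money_mention_count, credential_request_count)
TRAIN = [
    ((0, 0, 0, 0), "legitimate"),
    ((1, 0, 0, 0), "legitimate"),
    ((0, 1, 0, 0), "legitimate"),
    ((5, 1, 0, 0), "spam"),
    ((6, 0, 1, 0), "spam"),
    ((4, 0, 0, 0), "spam"),
    ((0, 2, 5, 0), "scam"),
    ((1, 3, 4, 0), "scam"),
    ((0, 2, 6, 0), "scam"),
    ((2, 4, 0, 3), "phishing"),
    ((1, 5, 0, 4), "phishing"),
    ((2, 3, 0, 5), "phishing"),
]


def knn_predict(train, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    neighbors = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


def classify(features, k=3):
    """Stage 1: legitimate vs. malicious. Stage 2: malicious subcategory."""
    # Stage 1 collapses the three malicious subcategories into one label.
    stage1 = [
        (x, "legitimate" if y == "legitimate" else "malicious") for x, y in TRAIN
    ]
    if knn_predict(stage1, features, k) == "legitimate":
        return "legitimate"
    # Stage 2 is trained only on the malicious examples.
    stage2 = [(x, y) for x, y in TRAIN if y != "legitimate"]
    return knn_predict(stage2, features, k)


if __name__ == "__main__":
    for sample in [(0, 0, 0, 0), (5, 0, 0, 0), (0, 2, 5, 0), (2, 4, 0, 4)]:
        print(sample, classify(sample))
```

Splitting the decision into two stages lets the second classifier be trained only on malicious examples, so the three subcategories do not have to compete with the legitimate class.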