Abstract
This research investigates the email features that make a phishing email difficult for humans to detect. We use an existing data set of phishing and ham (benign) emails and expand that data set by collecting annotations of the features that make the emails phishing. Using the new, annotated data set, we perform cluster analyses to identify the categories of emails and their attributes. We then analyze the accuracy of detection in each category. Our results indicate that the similarity of the features of phishing emails to those of ham emails plays a critical role in the accuracy of detection. The phishing emails that were most similar to ham emails were detected least accurately, while the phishing emails that were most dissimilar to ham emails were detected more accurately. Regression models reveal the contribution of phishing emails' features to phishing detection accuracy. We discuss the implications of these results for theory and practice.