Abstract

Web technology has become the cornerstone of a wide range of platforms, such as mobile services and smart Internet-of-things (IoT) systems. In such platforms, users’ data are aggregated on a cloud-based platform, where web applications serve as a key interface for accessing and configuring user data. Securing the web interface requires solutions that address threats from both technical vulnerabilities and social factors. Phishing is one of the most commonly exploited vectors in social engineering attacks. Attackers use web pages that visually mimic legitimate web sites, such as banking and government services, to collect users’ sensitive information. Existing phishing defense mechanisms based on URLs or page contents are often evaded by attackers. Recent research has demonstrated that visual layout similarity can serve as a robust basis for detecting phishing attacks. In particular, features extracted from CSS layout files can be used to measure page similarity. However, this approach requires human expertise to specify how page similarity is measured from such features. In this paper, we aim to automate page-layout-based phishing detection using machine learning. We propose a learning-based aggregation analysis mechanism to decide page layout similarity, which is then used to detect phishing pages. We prototype our solution and evaluate four popular machine learning classifiers in terms of their accuracy and the factors affecting their results.

Highlights

  • Web technology has become the cornerstone of a wide range of platforms, such as mobile services and smart Internet-of-things (IoT) systems

  • All the classifiers achieve more than 93% accuracy and more than 84% F1, which demonstrates that our approach can effectively detect phishing websites

  • Compared with other approaches, our method is lightweight, as it takes only one class of features, Cascading Style Sheets (CSS) structure, as input to identify the similarity of web pages and detect phishing attacks

Summary

Introduction

Web technology has become the cornerstone of a wide range of platforms, such as mobile services and smart Internet-of-things (IoT) systems. Features extracted from CSS layout files are used to measure page similarity. These measurements rely heavily on human experience and may not be comprehensive enough to detect new attacks. In our previous work [8, 9], we demonstrated that CSS-based page layout features can be used as the basis to detect phishing pages, where we convert CSS into a normalized representation called an influence vector. A CSS rule consists of two parts: a selector and one or more declarations. Selectors can be classified into four categories: tag, ID, class, and others.
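To make the selector categories concrete, the following is a minimal Python sketch that splits a single CSS rule into its selectors and declarations and assigns each selector to one of the four categories (tag, ID, class, others). The function names `split_rule` and `classify_selector` and the regular expressions are illustrative assumptions; this summary does not give the exact classification rules or the construction of the influence vector, so the sketch covers only simple selectors and treats everything else as "others".

```python
import re

# Assumption: a simplified CSS identifier pattern, good enough for illustration.
_IDENT = r"-?[A-Za-z_][\w-]*"

# Assumption: only *simple* selectors count as tag, ID, or class;
# combinators, pseudo-classes, attribute selectors, and "*" fall into "others".
_PATTERNS = [
    ("id", re.compile(rf"#{_IDENT}")),
    ("class", re.compile(rf"\.{_IDENT}")),
    ("tag", re.compile(_IDENT)),
]


def classify_selector(selector: str) -> str:
    """Return 'tag', 'id', 'class', or 'others' for one selector string."""
    s = selector.strip()
    for category, pattern in _PATTERNS:
        if pattern.fullmatch(s):
            return category
    return "others"


def split_rule(rule: str):
    """Split 'sel1, sel2 { prop: value; ... }' into (selectors, declarations)."""
    head, _, body = rule.partition("{")
    selectors = [s.strip() for s in head.split(",") if s.strip()]
    declarations = {}
    # Naive split on ';' (hypothetical simplification: no ';' inside values).
    for decl in body.rstrip().rstrip("}").split(";"):
        if ":" in decl:
            prop, _, value = decl.partition(":")
            declarations[prop.strip().lower()] = value.strip()
    return selectors, declarations


if __name__ == "__main__":
    rule = "#main, .btn, div, a:hover { color: red; margin: 0 }"
    selectors, declarations = split_rule(rule)
    for sel in selectors:
        print(sel, "->", classify_selector(sel))
    print(declarations)
```

A real stylesheet would need a proper CSS parser to handle comments, at-rules, and nested blocks; the naive string splitting here is only meant to show how a rule decomposes into selectors and declarations before any similarity features are computed.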

