Abstract

Cyber attackers create fake webpages and websites to advertise their products, deliver malware to a target device, or steal victims’ login credentials. Phishing is the illegitimate attempt to solicit sensitive and valuable information from a user by masquerading as a trustworthy agent. It relies on website and e-mail spoofing: a spoofed e-mail can redirect a user to a spoofed website, which in turn can trick the user into revealing valuable personal information. Traditional solutions for detecting spoofed or phishing websites rely on signature-based methods, which cannot detect newly created spoofed websites or webpages. To address this problem, researchers have turned to machine learning methods. This paper presents a diverse set of robust features grouped into three categories: webpage-based, URL-based, and HTML-based features. The features in each category are first used individually to classify webpages. A technique is then proposed in which all the features are combined for classification. The experimental results show that the URL-based features are the most effective for classifying webpages. Further, the proposed approach significantly improves classification accuracy, and random forest turns out to be the best classifier, with an accuracy of 99.5% and an FPR and FNR of 0.006 and 0.005, respectively.
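
The abstract does not list the individual features, so the following is only a minimal sketch of what URL-based feature extraction and the reported error metrics (FPR and FNR) might look like. The function names `url_features` and `fpr_fnr`, the five example features, and the label convention (1 = phishing, 0 = legitimate) are assumptions made for illustration and are not taken from the paper.

```python
import re
from urllib.parse import urlparse


def url_features(url: str) -> dict:
    """Extract a few illustrative URL-based features.

    These features are hypothetical examples, not the paper's feature set.
    """
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return {
        "url_length": len(url),
        "num_dots_in_host": host.count("."),
        "has_at_symbol": "@" in url,
        # Raw IPv4 address used in place of a domain name.
        "host_is_ipv4": bool(re.fullmatch(r"\d{1,3}(\.\d{1,3}){3}", host)),
        "uses_https": parsed.scheme == "https",
    }


def fpr_fnr(y_true, y_pred):
    """Return (FPR, FNR), assuming label 1 = phishing and 0 = legitimate."""
    pairs = list(zip(y_true, y_pred))
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    # FPR: share of legitimate pages wrongly flagged as phishing.
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    # FNR: share of phishing pages wrongly accepted as legitimate.
    fnr = fn / (fn + tp) if (fn + tp) else 0.0
    return fpr, fnr
```

Under these assumptions, the feature dictionaries from each category could be merged into one vector per webpage and passed to a classifier such as a random forest, and `fpr_fnr` could then be applied to the predictions on a held-out test set.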
