An Automated System to Detect Phishing URL by Using Machine Learning Algorithm

Deepa Parasar, Yogesh H Jadhav

doi:10.1007/978-3-030-49795-8_21

Abstract

Malicious URLs play a central role in today’s scams and attacks, and they threaten every aspect of computer usage. Identifying and detecting these malicious URLs is therefore crucial. Attackers pair malicious code with malicious software such as Trojan horses, worms, and backdoors. Detection of these URLs has traditionally relied on blacklists and whitelists. A blacklist alone is not sufficient, because it cannot keep up with newly created malicious URLs. These conventional approaches fall short in dealing effectively with evolving technologies and web searching mechanisms. In recent years, research attention on improving the ability to detect malicious URLs has increased, and detection systems have evolved accordingly. This paper proposes a classification method to address the difficulties that existing mechanisms encounter in malicious URL detection. The proposed classification model is based on high-performance machine learning methods that take into account not only the syntactic structure of a URL but also the semantic and lexical meaning of these dynamically changing URLs. The proposed approach is expected to overcome the drawbacks of the existing techniques. A comparative analysis of Logistic Regression, Support Vector Machine (SVM), and Naive Bayes classification has also been performed. In the simulation experiments, SVM achieved higher accuracy than Logistic Regression and Naive Bayes, reaching an accuracy of 85.35%.
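To make the idea of lexical URL features concrete, the following is a minimal sketch of how a URL might be turned into numeric inputs for classifiers such as those compared above. The function name `lexical_features` and the specific features are illustrative assumptions; the abstract does not list the feature set actually used.

```python
from urllib.parse import urlparse


def lexical_features(url: str) -> dict:
    """Extract a few simple lexical features from a URL.

    Hypothetical feature set chosen for illustration only;
    it is not taken from the paper.
    """
    # urlparse only recognises the host if a scheme is present,
    # so prepend one when the input is a bare domain.
    parsed = urlparse(url if "://" in url else "http://" + url)
    host = parsed.hostname or ""
    return {
        "url_length": len(url),
        "host_length": len(host),
        "num_dots_in_host": host.count("."),
        "num_hyphens_in_host": host.count("-"),
        "num_digits": sum(ch.isdigit() for ch in url),
        "has_at_symbol": "@" in url,
        # Crude check for a dotted-quad IPv4 host such as 192.168.0.1.
        "has_ip_host": host.count(".") == 3 and host.replace(".", "").isdigit(),
        # Number of non-empty path segments.
        "path_depth": len([seg for seg in parsed.path.split("/") if seg]),
    }


if __name__ == "__main__":
    for example in ("http://192.168.0.1/login/verify", "example.com"):
        print(example, lexical_features(example))
```

A feature dictionary of this kind could then be vectorized and passed to any of the three classifiers; the choice of features here only illustrates the general approach.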

Full Text