An Approach for Efficient and Accurate Phishing Website Prediction Using Improved ML Classifier Performance for Feature Selection

Anjaneya Awasthi, Noopur Goel

doi:10.52756/ijerr.2024.v40spl.006

Abstract

This article examines the use of machine learning (ML) to detect phishing websites: deceptive sites that mimic trusted entities to steal sensitive information. Such attacks pose significant risks to online security, which is why the continued development of methods for identifying and counteracting phishing threats is valuable. Traditional anti-phishing strategies such as blacklisting and heuristic searches often suffer from slow detection times and high false positive rates. To improve the success rate and specificity of phishing website prediction, this study proposes a new approach based on ML algorithms. With the aim of better classification and prediction results, the approach concentrates on two points: increasing classifier accuracy and selecting an optimal feature space. The article introduces a novel feature selection method that extracts highly correlated features from datasets, thereby enhancing classifier accuracy. Eight classifiers, including SVM, Logistic Regression, and Random Forest, are evaluated on a phishing dataset using six feature selection techniques. The study finds that the Random Forest classifier combined with the Chi-2 feature selection method significantly improves model accuracy, achieving up to 96.99%.
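
To make the Chi-2 feature selection step named in the abstract more concrete, the following is a minimal sketch of chi-squared feature scoring in plain Python. It uses the common formulation for non-negative features (observed per-class feature totals compared with the totals expected if feature and class were independent), which is an assumption and not necessarily the exact procedure used in the study. The feature names and toy data are hypothetical and do not come from the study's dataset, and the Random Forest classification step is omitted.

```python
def chi2_scores(X, y):
    """Return one chi-squared score per feature column.

    X is a list of rows with non-negative feature values,
    y is a list of class labels (here 1 = phishing, 0 = legitimate).
    """
    n_samples = len(X)
    n_features = len(X[0])
    classes = sorted(set(y))

    # Fraction of samples belonging to each class.
    class_prob = {c: sum(1 for label in y if label == c) / n_samples
                  for c in classes}

    # Total of each feature over all samples.
    feature_total = [sum(row[j] for row in X) for j in range(n_features)]

    # Observed total of each feature within each class.
    observed = {c: [0.0] * n_features for c in classes}
    for row, label in zip(X, y):
        for j, value in enumerate(row):
            observed[label][j] += value

    scores = []
    for j in range(n_features):
        score = 0.0
        for c in classes:
            # Expected total if the feature were independent of the class.
            expected = class_prob[c] * feature_total[j]
            if expected > 0:
                score += (observed[c][j] - expected) ** 2 / expected
        scores.append(score)
    return scores


def select_top_k(X, y, k):
    """Return the sorted column indices of the k highest-scoring features."""
    scores = chi2_scores(X, y)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


# Hypothetical binary features: [has_ip_in_url, uses_https, long_url]
X = [
    [1, 0, 1],
    [1, 0, 0],
    [1, 1, 1],
    [0, 1, 0],
    [0, 1, 1],
    [0, 1, 0],
]
y = [1, 1, 1, 0, 0, 0]  # 1 = phishing, 0 = legitimate (toy labels)

scores = chi2_scores(X, y)
selected = select_top_k(X, y, k=2)
```

In this toy data the first column separates the two classes perfectly, so it receives the highest score, and the third column, which is nearly balanced across classes, receives the lowest. In a full pipeline, the selected columns would then be passed to a classifier such as Random Forest.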

Full Text