Abstract

Phishing attacks in cyberspace have grown alongside advances in information technology. Phishing is one of the most prominent cyber-attacks rooted in social engineering: it aims to deceive individuals into divulging sensitive information such as credit card details, login credentials, and passwords. The aim of this research is to identify the best-performing phishing detection model among several machine learning (ML) techniques. For feature selection, this paper uses an Extra Trees Classifier (ETC), forward selection, Pearson correlation, a logit (logistic regression) model, and principal component analysis (PCA). The phishing detection models are built with Logistic Regression (LR), Naïve Bayes (NB), Decision Tree (DT), K-Nearest Neighbor (K-NN), Support Vector Machine (SVM), Random Forest (RF), AdaBoost, and Bagging classifiers. We study the models in four cases: case 1 uses the 6 features selected in common by ETC, forward selection, and Pearson correlation; case 2 uses the 25 features selected by the logit model; case 3 uses all features; and case 4 uses PCA with 3 and 5 components. The highest accuracy, 97.3%, is obtained in case 2 with the Random Forest model.
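
As a rough illustration of how the case 1 feature set could be formed, the sketch below ranks features by absolute Pearson correlation with the label and then keeps only the features that every selector agrees on. It is a minimal sketch under stated assumptions, not the paper's implementation: the feature names, the toy data, and the second selector's output are all hypothetical, and the helper names (`pearson`, `top_k_by_correlation`, `common_features`) are introduced here for illustration only.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)


def top_k_by_correlation(columns, labels, k):
    """Names of the k features with the largest |r| against the label."""
    ranked = sorted(
        columns,
        key=lambda name: abs(pearson(columns[name], labels)),
        reverse=True,
    )
    return set(ranked[:k])


def common_features(*selections):
    """Keep only the features chosen by every selector."""
    return set.intersection(*selections)


# Hypothetical toy data: 1 = phishing, 0 = legitimate (not from the paper).
labels = [1, 1, 1, 0, 0, 0]
columns = {
    "has_ip_address": [1, 1, 1, 0, 0, 0],
    "url_length": [90, 80, 85, 30, 25, 40],
    "uses_https": [0, 0, 1, 1, 1, 1],
    "dot_count": [2, 3, 2, 3, 2, 3],
}

selected_by_correlation = top_k_by_correlation(columns, labels, 3)

# Hypothetical stand-in for the output of another selector
# (for example, tree-based importances or forward selection).
selected_by_other = {"has_ip_address", "url_length", "dot_count"}

case1_features = common_features(selected_by_correlation, selected_by_other)
print(sorted(case1_features))
```

Intersecting the selections is one simple way to obtain a small set of features that several methods agree on; the resulting columns would then be passed to whichever classifier is being evaluated.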
