Abstract
This study aimed to develop a robust machine learning-based phishing detection system using three algorithms: K-nearest neighbour (KNN), artificial neural network (ANN), and random forest (RF). It used datasets from Ariyadasa et al. (2021) and UNB (2016) to learn the patterns that distinguish legitimate from phishing websites. A further objective was to integrate the best-performing model into a Django-based web application for real-time phishing detection. A comprehensive literature review of phishing detection techniques was also undertaken. The chosen datasets were pre-processed to address missing values and class imbalance. Features were selected both manually and automatically, the latter using mutual information for classification. The hyper-parameters of the RF, KNN, and ANN models were optimised using GridSearchCV. RF achieved an accuracy of 99.78%, KNN 99.67%, and ANN 99.11%. The RF and KNN models identified legitimate websites perfectly, whereas the ANN detected phishing websites perfectly. The RF model, having the highest accuracy, was integrated into a Django application that provides a user interface for real-time phishing detection. All three models achieved high accuracy, demonstrating their efficacy in phishing detection. Although RF was chosen for the web application in this study, the choice between models depends on specific user or business requirements and priorities. Feedback mechanisms within the Django application promise further refinement of its future recommendations. The study provides a foundational step toward enhancing web safety through effective phishing detection.
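The workflow summarised above (mutual-information feature selection, then GridSearchCV tuning of RF, KNN, and ANN) can be illustrated with a minimal scikit-learn sketch. It is not the study's implementation: the synthetic data from `make_classification` stands in for the Ariyadasa et al. and UNB datasets, `MLPClassifier` is assumed as the ANN, and the parameter grids, `k=10`, and the scaling step are illustrative assumptions. The accuracies it produces are therefore unrelated to those reported in the abstract.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: synthetic binary data stands in for the phishing datasets
# (label 1 = phishing, label 0 = legitimate).
X, y = make_classification(
    n_samples=400, n_features=20, n_informative=6, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Assumption: small illustrative grids, not the ones used in the study.
candidates = {
    "RF": (RandomForestClassifier(random_state=0),
           {"clf__n_estimators": [50, 100]}),
    "KNN": (KNeighborsClassifier(),
            {"clf__n_neighbors": [3, 5, 7]}),
    "ANN": (MLPClassifier(max_iter=500, random_state=0),
            {"clf__hidden_layer_sizes": [(16,), (32,)]}),
}

results = {}
for name, (estimator, grid) in candidates.items():
    pipe = Pipeline([
        ("scale", StandardScaler()),  # helps the distance- and gradient-based models
        ("select", SelectKBest(score_func=mutual_info_classif, k=10)),
        ("clf", estimator),
    ])
    # 3-fold cross-validated grid search on the training split only.
    search = GridSearchCV(pipe, grid, cv=3, scoring="accuracy")
    search.fit(X_train, y_train)
    # Accuracy of the refitted best pipeline on the held-out test split.
    results[name] = search.score(X_test, y_test)

best_name = max(results, key=results.get)
print(results, "best:", best_name)
```

Placing the feature selector inside the pipeline means it is refitted on each cross-validation fold, so the test folds do not leak into the selection step. The model stored under `best_name` is the kind of object that could then be serialised and loaded by a web application such as the Django one described above.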
More From: International Journal of Scientific and Management Research