Abstract

Introduction: The world is shifting towards email as a fast and secure means of electronic communication, and millions of users, including organizations and governments, rely on email services. This growing user base faces phishing attacks, and detecting phishing email is a challenging task, especially for non-IT users. Automatic detection of phishing email therefore needs to be deployed alongside email software. Various authors have worked on phishing email classification using different feature selection and optimization techniques to improve performance.

Objective: This paper builds a model for detecting phishing email using data mining techniques. Its main contribution is the development and application of a Feature Selection Technique (FST) that reduces the number of features in a benchmark phishing email data set.

Methods: The proposed Pruning Based Feature Selection Technique (PBFST) ranks each feature according to the level of the tree at which the feature appears. It is integrated with the previously developed Bucket Based Feature Selection Technique (BBFST), which is used internally to rank the features within a given level of the tree.

Results: Experiments were carried out with the open-source WEKA data mining software using 10-fold cross-validation. The proposed FST was compared with other ranking-based FSTs by measuring the performance of the C4.5 classifier on the phishing email data set.

Conclusion: The proposed FST removes 33 of the 47 features in the phishing email data set, and the C4.5 algorithm achieves an accuracy of 99.06% with only 11 features, which is better than the other existing FSTs compared.
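The level-based ranking described in the Methods can be illustrated with a minimal sketch. This is not the authors' PBFST or BBFST, whose details are not given here. It assumes a decision tree represented as nested dictionaries, hypothetical feature names, and a hypothetical per-feature score standing in for the within-level ranking that BBFST would provide.

```python
from collections import deque


def rank_features_by_level(tree, tie_break=None):
    """Rank features by the shallowest tree level at which they are tested.

    `tree` is a nested dict: internal nodes have "feature" and "children",
    leaves have only "label". `tie_break` maps feature -> score and is a
    hypothetical stand-in for a within-level ranking such as BBFST.
    """
    tie_break = tie_break or {}
    first_level = {}
    queue = deque([(tree, 0)])
    # Breadth-first traversal visits nodes in non-decreasing level order,
    # so the first time a feature is seen is its shallowest level.
    while queue:
        node, level = queue.popleft()
        if "feature" not in node:  # leaf node, nothing to rank
            continue
        first_level.setdefault(node["feature"], level)
        for child in node.get("children", []):
            queue.append((child, level + 1))
    # Lower level ranks first; within a level, higher score ranks first.
    return sorted(
        first_level,
        key=lambda f: (first_level[f], -tie_break.get(f, 0.0), f),
    )


# Toy tree with hypothetical phishing-email features.
toy_tree = {
    "feature": "has_ip_url",
    "children": [
        {"feature": "num_links",
         "children": [{"label": "phishing"}, {"label": "ham"}]},
        {"feature": "has_form",
         "children": [
             {"feature": "num_links",
              "children": [{"label": "ham"}, {"label": "phishing"}]},
             {"label": "phishing"},
         ]},
    ],
}
scores = {"has_form": 0.4, "num_links": 0.2}  # hypothetical within-level scores
ranking = rank_features_by_level(toy_tree, scores)
print(ranking)
```

Features that never appear in the tree receive no rank in this sketch, which is one way a level-based ranking could be used to discard features before retraining a classifier on the reduced set.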

