Abstract
In this article, a hybrid algorithm is proposed for distinguishing phishing websites from legitimate ones. The dataset may have an imbalanced class distribution and may contain irrelevant features. Therefore, during data preprocessing, the adaptive synthetic sampling approach is used to handle the class imbalance. Irrelevant or redundant features are then removed from the balanced data using the proposed binary versions of the Rao algorithms. S-shaped and V-shaped transfer functions are applied to map the continuous search space to a discrete search space, and the results obtained with each family of transfer functions are analyzed for the proposed algorithms. Performance is further improved by optimizing the value of the k parameter in the kNN classifier. The dataset used in this article is taken from the UCI machine-learning repository. The performance of the proposed approach is evaluated using the polygon area metric. The obtained classification accuracy is 97.044%. For validation, the proposed hybrid algorithm is compared with other state-of-the-art techniques. Moreover, the proposed approach is compared with seven metaheuristic feature selection algorithms and six filter methods for performance analysis. Additionally, the proposed approach is applied to URLs that are registered on the PhishTank website.
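To illustrate how transfer functions map a continuous search space to a discrete one, the following is a minimal sketch in Python. It is not the authors' implementation: the abstract does not state which S-shaped and V-shaped functions are used, so the sigmoid and |tanh| forms below, the bit-update rules, and all function names (`s_shaped`, `v_shaped`, `binarize_s`, `binarize_v`) are assumptions drawn from common practice in binary metaheuristics.

```python
import math
import random


def s_shaped(x: float) -> float:
    """Sigmoid transfer function (assumed form): maps a real value to [0, 1]."""
    # Numerically stable evaluation that avoids overflow for large |x|.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def v_shaped(x: float) -> float:
    """|tanh(x)| transfer function (assumed form): maps a real value to [0, 1]."""
    return abs(math.tanh(x))


def binarize_s(position, rng):
    """S-shaped rule (assumed): set each bit to 1 with probability S(x)."""
    return [1 if rng.random() < s_shaped(x) else 0 for x in position]


def binarize_v(position, current_bits, rng):
    """V-shaped rule (assumed): flip each current bit with probability V(x)."""
    return [1 - b if rng.random() < v_shaped(x) else b
            for x, b in zip(position, current_bits)]


if __name__ == "__main__":
    rng = random.Random(0)                 # fixed seed for repeatability
    position = [-2.0, -0.5, 0.0, 0.5, 2.0]  # hypothetical continuous candidate
    mask = binarize_s(position, rng)        # 1 = keep feature, 0 = drop feature
    print(mask)
    print(binarize_v(position, mask, rng))
```

In a feature selection setting, each bit of the resulting vector would indicate whether the corresponding feature is kept (1) or discarded (0). The S-shaped rule sets a bit directly from the continuous value, whereas the V-shaped rule decides only whether to flip the current bit, so values of large magnitude in either direction make a change more likely.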