Abstract
Emerging technologies have made internet connectivity vital for accessing many services. However, connectivity also raises many security concerns, such as the illegal acquisition of private information, passwords, and identifiers. Phishing websites are a first choice for attackers seeking access to users' private data. These social engineering attacks work by designing fake websites that resemble real ones and luring victims to them in order to collect their sensitive information before redirecting them to the actual site. Given the importance of detecting phishing websites, it is necessary to build a robust detector that filters them and blocks their activity on the Internet. In this paper, we propose a phishing website detector based on a convolutional neural network (CNN) improved with a self-attention mechanism. The proposed detector treats Uniform Resource Locators (URLs) as strings in order to detect phishing ones. CNN models have proved efficient at handling text strings compared with Long Short-Term Memory (LSTM) networks, which focus on temporal features. Using a CNN allows the model to learn comprehensive features of the URLs and facilitates the detection of phishing ones. The self-attention mechanism was added to enhance the model's focus and detection accuracy. In addition, the training dataset was balanced by generating phishing URLs with a Generative Adversarial Network (GAN). A set of experiments demonstrated the robustness of the proposed detector, which achieved high detection accuracy on the test set. The detector was also tested on unknown URLs and achieved excellent results. The improved CNN reached a detection precision of 99.7%, which is 2.74% higher than that of the regular CNN model. The reported results show that the self-attention mechanism improved the detection accuracy and made the CNN model more efficient at detecting phishing websites.
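To make the idea of treating URLs as strings and applying self-attention more concrete, the following is a minimal sketch in plain Python. It is not the paper's implementation: the character vocabulary, the padding length, the random embedding table, and the use of identity query/key/value projections are all assumptions made for illustration, and the convolutional layers and GAN-based balancing are omitted entirely.

```python
import math
import random

# Hypothetical vocabulary (an assumption, not from the paper):
# lowercase letters, digits, and punctuation commonly seen in URLs.
VOCAB = "abcdefghijklmnopqrstuvwxyz0123456789-._~:/?#[]@!$&'()*+,;=%"
PAD_ID, UNK_ID = 0, 1
CHAR_TO_ID = {ch: i + 2 for i, ch in enumerate(VOCAB)}


def encode_url(url, max_len=64):
    """Map a URL string to a fixed-length list of integer character ids."""
    ids = [CHAR_TO_ID.get(ch, UNK_ID) for ch in url.lower()[:max_len]]
    return ids + [PAD_ID] * (max_len - len(ids))


def embed(ids, dim=8, seed=0):
    """Look up each id in a random embedding table (a stand-in for learned weights)."""
    rng = random.Random(seed)
    table = [[rng.uniform(-1.0, 1.0) for _ in range(dim)]
             for _ in range(len(VOCAB) + 2)]
    return [table[i] for i in ids]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(vectors):
    """Scaled dot-product self-attention with identity Q/K/V projections.

    Each output row is a weighted average of all input rows, with weights
    given by a softmax over scaled dot products. Padding positions are not
    masked in this simplified sketch.
    """
    dim = len(vectors[0])
    scale = math.sqrt(dim)
    outputs = []
    for query in vectors:
        scores = [sum(q * k for q, k in zip(query, key)) / scale
                  for key in vectors]
        weights = softmax(scores)
        outputs.append([sum(w * v[d] for w, v in zip(weights, vectors))
                        for d in range(dim)])
    return outputs


if __name__ == "__main__":
    ids = encode_url("http://example.com/login", max_len=32)
    attended = self_attention(embed(ids))
    print(len(attended), len(attended[0]))
```

In a full model, the attended character representations would typically be combined with convolutional feature maps and passed to a classifier; how the paper combines the two is not specified in this abstract.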