Natural language based malicious domain detection using machine learning and deep learning

A.S. Saleem Raja, M.S. Jayakumar, S. Mahalakshmi, G. Pradeepa

doi:10.17586/2226-1494-2023-23-2-304-312

Abstract

Cyberattacks remain a challenge because their number grows day by day. Cybercriminals employ a variety of strategies to manipulate their targets and exploit their vulnerabilities. Malicious URLs are one such strategy, used to target large groups on various social media platforms. To lure internet users, these web addresses are disguised as safe. Deliberate or inadvertent use of such URLs exposes the user or the organization in cyberspace and opens the way for further attacks. Systems that use rule-based or machine learning algorithms to find malicious URLs usually rely on feature engineering, which requires domain expertise and experience. Even then, the extracted features may not fully leverage the potential of the dataset. The proposed method employs Natural Language Processing (NLP) approaches to vectorize the words in the URLs and applies machine learning and deep learning models for classification. Vectorization in NLP reduces the effort of feature engineering and maximizes the use of the dataset. Three different vectorization methods are used to vectorize the URL text. To evaluate the performance of the proposed method, two different datasets (D1 and D2) that are regularly used in the research domain were employed. The results show that the best accuracy on the D1 dataset, 92.4 %, is achieved by the Decision Tree (DT) with the count vectorizer and by the Random Forest (RF) with the Term Frequency-Inverse Document Frequency (TF-IDF) vectorizer. On the D2 dataset, DT with the TF-IDF vectorizer obtains the best accuracy, 99.5 %. The Artificial Neural Network (ANN) model achieves 89.6 % accuracy on the D1 dataset and 99.2 % accuracy on the D2 dataset.
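
The vectorization step named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy URLs, the tokenization rule, the helper names (`tokenize`, `count_vector`, `idf`, `tfidf_vector`), and the smoothed IDF formula are all assumptions chosen for illustration, and the paper's D1 and D2 datasets are not used. The sketch builds count vectors and TF-IDF vectors from the words in each URL using only the Python standard library.

```python
import math
import re
from collections import Counter

# Hypothetical toy URLs; not taken from the paper's D1 or D2 datasets.
urls = [
    "http://example.com/login/account",
    "http://secure-login.example.net/verify/account",
    "http://example.org/news/sports",
]


def tokenize(url):
    """Split a URL into lowercase alphanumeric word tokens (assumed rule)."""
    return [t for t in re.split(r"[^a-z0-9]+", url.lower()) if t]


docs = [tokenize(u) for u in urls]
vocab = sorted({t for doc in docs for t in doc})


def count_vector(doc):
    """Raw term counts over the shared vocabulary."""
    counts = Counter(doc)
    return [counts[t] for t in vocab]


def idf(term):
    """One common smoothed variant: ln((1 + N) / (1 + df)) + 1."""
    df = sum(1 for doc in docs if term in doc)
    return math.log((1 + len(docs)) / (1 + df)) + 1


def tfidf_vector(doc):
    """Term count times IDF, scaled to unit L2 norm."""
    counts = Counter(doc)
    raw = [counts[t] * idf(t) for t in vocab]
    norm = math.sqrt(sum(x * x for x in raw)) or 1.0
    return [x / norm for x in raw]


count_matrix = [count_vector(d) for d in docs]
tfidf_matrix = [tfidf_vector(d) for d in docs]
```

Each row of `count_matrix` or `tfidf_matrix` is a fixed-length numeric vector that a classifier such as a decision tree, random forest, or neural network could consume without hand-crafted URL features. Words that occur in every URL (here `http`) receive the lowest IDF weight, so rarer words contribute more to the TF-IDF vectors.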


Journal: Scientific and Technical Journal of Information Technologies, Mechanics and Optics
Publication Date: Apr 1, 2023
Citations: 1
License type: cc-by-nc


Similar Papers

Integrating Natural Language Processing and Machine Learning Algorithms to Categorize Oncologic Response in Radiology Reports.
Po-Hao Chen ... Tessa Cook
Journal of digital imaging | VOL. 31
27 Oct 2017

Automated Detection of Radiology Reports that Require Follow-up Imaging Using Natural Language Processing Feature Engineering and Machine Learning Classification.
Robert Lou ... Tessa S Cook
Journal of digital imaging | VOL. 33
03 Sep 2019

Automated Identification of Aspirin-Exacerbated Respiratory Disease Using Natural Language Processing and Machine Learning: Algorithm Development and Evaluation Study
Thanai Pongdee ... Nicholas B Larson
JMIR AI | VOL. 2
12 Jun 2023

Resume Classification System using Natural Language Processing and Machine Learning Techniques
Irfan Ali ... Nimra Mughal
Mehran University Research Journal of Engineering and Technology | VOL. 41
01 Jan 2021
