Abstract

Traditional malicious uniform resource locator (URL) detection methods rely excessively on matching rules formulated by network security personnel, and such rules struggle to capture the full text information of a URL. This paper therefore proposes an improved multilayer recurrent convolutional neural network model based on the YOLO algorithm to detect malicious URLs. First, single characters are mapped to dense vectors using word embedding, and these dense vectors are trained jointly with the rest of the model, in line with the structural characteristics of URLs. Then, a CSPDarknet neural network model based on the improved YOLO algorithm extracts features from the URL. Finally, a bidirectional LSTM recurrent neural network uses the extracted features to classify the URL as malicious or not. To verify the validity of the algorithm, a total of 200,000 URLs are collected, comprising 100,000 normal URLs labeled “good” and 100,000 malicious URLs labeled “bad”. The experimental results show that the method detects malicious URLs more quickly and effectively, with high accuracy and a high recall rate, compared with Text-RCNN, BRNN, and other models.
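The character-to-dense-vector step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the vocabulary construction, the padding index 0, the maximum length of 32, the embedding dimension of 8, and the sample URLs are all assumptions chosen for illustration, and the lookup table here is only randomly initialized, whereas in the paper's model the vectors would be updated during training.

```python
import random

def build_vocab(urls):
    """Map each distinct character to an integer index; 0 is reserved for padding."""
    chars = sorted({ch for url in urls for ch in url})
    return {ch: i + 1 for i, ch in enumerate(chars)}

def encode(url, vocab, max_len):
    """Turn a URL into a fixed-length list of character indices (truncate or pad)."""
    # Assumption: characters missing from the vocabulary share the padding index 0.
    ids = [vocab.get(ch, 0) for ch in url[:max_len]]
    return ids + [0] * (max_len - len(ids))

def init_embedding(vocab_size, dim, seed=0):
    """Create a lookup table with one small random dense vector per index."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.05, 0.05) for _ in range(dim)]
            for _ in range(vocab_size + 1)]  # +1 for the padding row

def embed(ids, table):
    """Look up the dense vector for every character index."""
    return [table[i] for i in ids]

# Hypothetical sample URLs, used only to build a small vocabulary.
urls = ["http://example.com/login", "http://example.org/a?b=1"]
vocab = build_vocab(urls)
ids = encode(urls[0], vocab, max_len=32)
table = init_embedding(len(vocab), dim=8)
vectors = embed(ids, table)  # 32 rows, each a vector of length 8
```

The resulting matrix of 32 rows by 8 columns is the kind of input a convolutional feature extractor would consume; a deep learning framework would replace the hand-written table with a trainable embedding layer.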

Highlights

  • With the rapid development of Internet technology, network crime is becoming more and more serious, which brings heavy losses for personal network privacy and property security [1]

  • Taking an accuracy of 94% as the threshold, the improved model based on CSPDarknet reaches it in the 10th iteration, whereas the improved model based on Darknet reaches it only after the 17th iteration. Therefore, the improved model based on CSPDarknet converges faster, and its accuracy is slightly higher than that of the improved model based on Darknet. The traditional bidirectional recurrent neural network model, the RCNN model, and the network model based on the fully connected layer all exhibit overfitting, and the severity of overfitting increases in that order

  • Identifying and detecting malicious uniform resource locators (URLs) are two important methods of maintaining network information security. The traditional malicious URL detection methods rely unduly on similarity matching rules, and the information in the URL text is lost after numerical vectorization. Together, these make it difficult to identify the contextual relationships within a URL and lead to misjudgments and omissions


Summary

Introduction

With the rapid development of Internet technology, network crime is becoming more and more serious, which brings heavy losses for personal network privacy and property security [1]. The method classifies the input URL as benign, unknown, or malicious and uses a cost matrix to select the most relevant features and to control model misclassification. It can effectively reduce the false detection rate for malicious URLs. A malicious URL detection model based on the combination of a multilayer convolutional neural network and a bidirectional recurrent neural network is proposed in the paper. The results of feature extraction are input into the bidirectional recurrent neural network, which processes the sequence in both the forward and backward directions to detect malicious URLs. The one-hot word vector associates each character with a unique integer index i and converts this integer index into a binary character vector of length M.
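The one-hot encoding mentioned above can be sketched in a few lines. This is an illustration under stated assumptions, not the paper's code: the six-character alphabet is hypothetical, and M is assumed here to equal the number of distinct characters in that alphabet.

```python
def one_hot(index, size):
    """Return a binary vector of the given size with a single 1 at position index."""
    if not 0 <= index < size:
        raise ValueError("index out of range for the vector length")
    vec = [0] * size
    vec[index] = 1
    return vec

# Hypothetical alphabet; a real one would cover every character seen in the URLs.
alphabet = "abc./:"
char_to_index = {ch: i for i, ch in enumerate(alphabet)}  # unique integer index i
M = len(alphabet)  # assumed vector length

encoded = [one_hot(char_to_index[ch], M) for ch in "a.c"]
# encoded[0] is [1, 0, 0, 0, 0, 0] because 'a' has index 0
```

Each vector has length M and exactly one nonzero entry, so the representation grows with the alphabet and carries no notion of similarity between characters. This is the usual motivation for replacing it with learned dense vectors.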

[Figures omitted. Surviving labels: character frequency; Xt, forgetting gate, input gate; hidden layer; ResUnit n; accuracy; loss value; model type.]
Conclusions