Abstract

In recent years, Internet of Things (IoT) devices have come to play an important role in business, education, medicine, and other fields. The number of devices connected to the Internet is now much larger than the world population. However, because these devices are so accessible, they are easily exposed to all kinds of attacks from the Internet. Most attacks against IoT devices are based on Web applications, so protecting the security of Web services can effectively improve the security of the IoT ecosystem. Conventional Web attack detection methods rely heavily on samples, and the results of artificial intelligence detection are uninterpretable. Hence, this article introduces a supervised detection algorithm based on benign samples. The Seq2Seq algorithm is chosen and applied to detect malicious Web requests. Meanwhile, an attention mechanism is introduced to label the attack payload and highlight abnormal characters. Experimental results show that, when trained on benign samples, the proposed model achieves a precision of 97.02% and a recall of 97.60%. This indicates that the model can detect Web attack requests effectively. At the same time, the model can label the attack payload visually, which makes the model “interpretable.”

Highlights

  • Today portable devices are playing an important role in business, education, and medicine as well as in other fields

  • From observation of the data sets, we conclude that the results for benign samples accord with expectations, while many malicious samples were mislabeled; most of the wrongly labeled requests are shorter than 20 bits

  • Here, the semantic vector Ct represents the weight of the current input relative to the output of the model, which is similar to the human attention mechanism
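The role of the semantic vector Ct in the last highlight can be illustrated with a minimal sketch. It assumes simple dot-product scoring between a decoder state and each encoder state; the paper's actual scoring function is not given here, and the names `softmax` and `context_vector` and the toy values are hypothetical. The attention weights it returns are the kind of per-position scores that could be used to highlight abnormal characters.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def context_vector(decoder_state, encoder_states):
    """Return (weights, Ct) for one decoding step.

    Assumption: scores are plain dot products between the decoder state
    and each encoder hidden state; other scoring functions are possible.
    """
    scores = [
        sum(d * h for d, h in zip(decoder_state, h_i))
        for h_i in encoder_states
    ]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    # Ct is the attention-weighted sum of the encoder hidden states.
    context = [
        sum(w * h_i[k] for w, h_i in zip(weights, encoder_states))
        for k in range(dim)
    ]
    return weights, context


# Toy values for illustration only (not taken from the paper).
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_state = [2.0, 0.0]
weights, ct = context_vector(decoder_state, encoder_states)
```

In this toy case the first and third input positions receive equal, larger weights than the second, so they contribute most to Ct.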


Summary

Introduction

Today portable devices are playing an important role in business, education, and medicine as well as in other fields. Ten machine-learning algorithms, including RF (Random Forest),[16] ADTree (Alternating Decision Tree),[17] SVM, and LR (Logistic Regression),[18] are used to test and identify whether the eigenvectors are attacks. This method compares the algorithms and obtains the best model, but its scalability is limited and human factors greatly affect the detection results, because it requires manual maintenance and the manual extraction of large numbers of eigenvectors. The sequence vector was generated based on the American Standard Code for Information Interchange (ASCII).[20] CNN (convolutional neural networks)[21] and LSTM (long short-term memory)[22] were trained on the sequence vectors to build a model and classify sample data. This method obtained rather good results, with a 98.2% detection rate and a 97.84% recall rate. There are still some problems that need resolving: 1. The lack of label data: In real environments there are numerous normal request samples but few, varied attack samples, which hinders the model’s learning and training
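The ASCII-based sequence vector mentioned in the Introduction can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of the cited work: the function name `encode_request`, the fixed length `max_len`, the padding value 0, and the out-of-range marker 128 are all hypothetical choices.

```python
def encode_request(request, max_len=64, pad_value=0, unknown_value=128):
    """Map a Web request string to a fixed-length list of ASCII codes.

    Assumptions (not from the source): requests longer than max_len are
    truncated, shorter ones are right-padded with pad_value, and any
    non-ASCII character is replaced by unknown_value.
    """
    codes = []
    for ch in request[:max_len]:
        code = ord(ch)
        codes.append(code if code < 128 else unknown_value)
    # Right-pad so every request yields a vector of the same length.
    return codes + [pad_value] * (max_len - len(codes))


# Illustrative request fragment only.
vector = encode_request("GET /", max_len=8)
# vector == [71, 69, 84, 32, 47, 0, 0, 0]
```

A fixed length is used here because sequence models are commonly trained on batches of equal-length vectors; the length actually used in the cited work is not stated.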

The lack of sample classes
The interpretability of the results
Encoder
Decoder
Results
Conclusion