Deep Q-network-based heuristic intrusion detection against edge-based SIoT zero-day attacks

Shigen Shen, Chenpeng Cai, Zhenwei Li, Yizhou Shen, Guowen Wu, Shui Yu

doi:10.1016/j.asoc.2023.111080

Abstract

Zero-day attacks can cause severe damage to social Internet of Things (SIoT) systems, so processing and classifying them has become an active research topic. To address this problem, we propose DQN-HIDS, a heuristic learning intrusion detection system based on Deep Q-Networks (DQN) for edge-based SIoT networks in which training samples are insufficient. DQN-HIDS consists of an SIoT network traffic processing module and a DQN-based heuristic learning network. The traffic processing module generates SIoT traffic samples, selects the samples that enter a classifier and a cybersecurity examiner center, and outputs similarity. We integrate DQN into the heuristic learning network to gradually improve its ability to identify malicious traffic. Specifically, reward functions are designed according to the actions the network selects, so that incorrectly labeling malicious samples is punished and the variable reward functions adapt to different execution actions. The LSTM-based DQN then maximizes the cumulative expected reward to find the optimal strategy for the heuristic learning network. Consequently, DQN-HIDS gradually increases the frequency of its labeling actions, reduces resource workloads, and improves its ability to label SIoT network traffic. Experiments evaluate DQN-HIDS in terms of the workload of the examiner center, the queue workload of delayed samples, the rewards obtained by the DQN-based heuristic learning network, and the accuracy of the classifier. Comparisons with a state-of-the-art deep learning model and typical machine learning methods demonstrate the advantages of DQN-HIDS when fewer SIoT network traffic samples are available.
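
The action-dependent reward idea described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the four-way action set (`Action`), the numeric reward values, the choice to penalize a missed malicious sample more heavily than a false alarm, and the discount factor `gamma`. The sketch only shows how a reward can vary with the selected action and how a cumulative discounted reward, the quantity a DQN agent tries to maximize, is computed.

```python
from enum import Enum


class Action(Enum):
    # Hypothetical action set; the abstract does not enumerate the actions.
    LABEL_BENIGN = 0
    LABEL_MALICIOUS = 1
    SEND_TO_EXAMINER = 2
    DELAY = 3


def reward(action: Action, is_malicious: bool) -> float:
    """Return an illustrative, action-dependent reward (values are assumed)."""
    if action is Action.SEND_TO_EXAMINER:
        return -0.2  # assumed small cost for adding to the examiner workload
    if action is Action.DELAY:
        return -0.1  # assumed small cost for growing the delayed-sample queue
    predicted_malicious = action is Action.LABEL_MALICIOUS
    if predicted_malicious == is_malicious:
        return 1.0  # correct label
    # Assumption: mislabeling a malicious sample as benign costs more
    # than raising a false alarm on a benign sample.
    return -2.0 if is_malicious else -1.0


def discounted_return(rewards: list[float], gamma: float = 0.9) -> float:
    """Cumulative discounted reward: r_0 + gamma * r_1 + gamma^2 * r_2 + ..."""
    total = 0.0
    for r in reversed(rewards):
        total = r + gamma * total
    return total
```

With these assumed values, an agent that forwards every sample to the examiner or delays it accumulates a steady negative return, while one that labels correctly accumulates a positive one, which is the kind of pressure that would push labeling frequency up over time.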

Full Text