Recent years have witnessed a rapid rise in wireless networks and allied interoperable distributed computing frameworks, in which sensory units transfer real-world event data to a network analyzer for run-time decisions. A wide array of applications employ the edge Internet of Things (Edge-IoT), where edge nodes collect local data to make real-time decisions. However, because existing Edge-IoT systems are decentralized, infrastructure-less, and dynamic, they remain vulnerable to man-in-the-middle attacks, intrusion, denial-of-service attacks, etc. Although numerous efforts have been made towards intrusion detection in IoT networks, most approaches focus on standalone (single-attack) intrusion detection, so their scalability to multiple attack detection remains unaddressed. Conversely, deploying a separate intrusion detection system for each type of attack can cause resource exhaustion and delay. Recently, authors have used deep learning methods such as convolutional neural networks (CNN) and long short-term memory (LSTM) networks to perform learning-based intrusion detection. However, because these methods rely merely on local features, their reliability remains questionable. Such methods also ignore the long-term dependency problem, which limits their efficacy for intrusion detection in temporal Edge-IoT network traffic. With this motivation, this paper proposes a contextual deep semantic feature-driven multi-type intrusion detection model (CDS-MNIDS) for Edge-IoT networks. The proposed CDS-MNIDS model first segments network traffic from the temporal network traces obtained from the network gateway. Subsequently, each node’s dynamic features, including the node’s address, packet size, transmission behavior, etc., are encoded with Word2Vec, followed by cascaded deep network-based learning and prediction.
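As a rough illustration of the segmentation and tokenization steps that precede Word2Vec encoding, the sketch below groups packet records into time windows and turns each record into discrete tokens. The fixed one-second window, the record fields (`ts`, `src`, `size`), the 256-byte size bins, and the function names are all assumptions made for illustration; the abstract does not specify them.

```python
from collections import defaultdict


def segment_traces(records, window_s=1.0):
    """Group packet records into consecutive fixed-length time windows.

    The fixed window length is an assumption, not the paper's stated rule.
    """
    windows = defaultdict(list)
    for rec in records:
        windows[int(rec["ts"] // window_s)].append(rec)
    return [windows[k] for k in sorted(windows)]


def to_tokens(rec, size_bin=256):
    """Turn one packet record into discrete 'words' (hypothetical scheme)."""
    return [f"src={rec['src']}", f"size_bin={rec['size'] // size_bin}"]


# Hypothetical gateway trace: timestamp (s), source address, packet size (bytes).
records = [
    {"ts": 0.10, "src": "10.0.0.1", "size": 60},
    {"ts": 0.70, "src": "10.0.0.2", "size": 1400},
    {"ts": 1.20, "src": "10.0.0.1", "size": 300},
]

segments = segment_traces(records, window_s=1.0)
# One token "sentence" per traffic segment, ready for an embedding model.
sentences = [[tok for rec in seg for tok in to_tokens(rec)] for seg in segments]
```

Token sequences of this shape are the kind of input a Word2Vec implementation typically consumes; the actual feature set and encoding used by CDS-MNIDS may differ.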
The CDS-MNIDS model embodies a cascaded deep network comprising LSTM and bidirectional LSTM (BiLSTM) networks: the former extracts local features, while the latter obtains contextual features from the input local feature vector. The extracted local and contextual features are passed to a global average pooling layer followed by a fully connected layer, which, in conjunction with a Softmax layer, performs multi-class classification.
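The classification head described above can be sketched in plain Python under stated assumptions. The per-time-step LSTM and BiLSTM outputs are replaced by small hand-written stand-in vectors, the two feature sets are concatenated per step (the abstract does not say how they are combined), and the weights are hypothetical rather than trained values.

```python
import math


def global_average_pool(seq):
    """Average a sequence of feature vectors over the time axis."""
    steps = len(seq)
    return [sum(col) / steps for col in zip(*seq)]


def dense(x, weights, bias):
    """Fully connected layer: one output per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Stand-ins for per-time-step outputs (two steps, two features each).
local_feats = [[0.2, 0.4], [0.6, 0.0]]      # would come from the LSTM
context_feats = [[1.0, -1.0], [0.0, 1.0]]   # would come from the BiLSTM

# Assumption: concatenate local and contextual features at each step.
combined = [l + c for l, c in zip(local_feats, context_feats)]
pooled = global_average_pool(combined)

# Hypothetical weights for a three-class output (e.g. three traffic types).
weights = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0]]
bias = [0.0, 0.0, 0.0]

probs = softmax(dense(pooled, weights, bias))
predicted = probs.index(max(probs))
```

In a real model the stand-in vectors would be learned recurrent outputs and the weights would be trained; the sketch only shows how pooling collapses the time axis before the multi-class decision.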