Explaining IoT Attacks: An Effective and Efficient Semi-Supervised Learning Framework

Giuseppe Cascavilla, Reinier Zwart, Damian A. Tamburri, Alfredo Cuzzocrea

doi:10.1109/bigdata55660.2022.10020894

Abstract

Cyber-attacks targeting Internet-of-Things (IoT) devices are prevalent due to the limited security resources of the target devices and their often limited connectivity. Explaining such attacks is therefore important for constructing countermeasures. Current methods of automated IoT attack analysis either require large amounts of labelled data for classification or rely on clustering methods, which can be inaccurate. However, when a desired grouping of the data is known and prior knowledge about some observations in the data is available, approximate semi-supervised learning methods may be used to create accurate cluster arrangements. We therefore investigated the use of semi-supervised clustering approaches for grouping IoT attack sessions into accurate clusters based on their goals and shared characteristics. We first manually created a ground-truth grouping of recent IoT attacks based on their goal. We distinguished the goal of each session according to the purpose of the commands used and the approach taken, resulting in a total of five classes. We then automatically constructed a feature set suitable for clustering similar IoT attack sessions using a method proposed in recent literature, and passed it to two semi-supervised clustering algorithms that use either labelled data (SeededKMeans) or pairwise constraints (PCKMeans) as prior knowledge. We found that both semi-supervised approaches created accurate cluster arrangements using only small amounts of prior knowledge. Moreover, both were more accurate than an entirely unsupervised KMeans algorithm.
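To illustrate how labelled data can act as prior knowledge, the following is a minimal sketch of the seeded variant of KMeans: each centroid is initialised as the mean of the labelled seed points for its class, after which ordinary KMeans iterations run on all points. It is not the implementation used in the paper. The function name `seeded_kmeans`, its parameters, and the toy two-dimensional points are assumptions made for illustration, and the real feature set for attack sessions is not reproduced here.

```python
from math import dist


def seeded_kmeans(points, seeds, k, n_iter=20):
    """Illustrative seeded KMeans (hypothetical helper, not the paper's code).

    points: list of equal-length numeric tuples.
    seeds:  dict mapping a point index to its known class label in range(k).
    Returns (labels, centroids).
    """
    dim = len(points[0])

    def mean(members):
        # Component-wise mean of a non-empty list of points.
        return tuple(sum(p[d] for p in members) / len(members) for d in range(dim))

    # Prior knowledge enters here: centroids start at the seed means
    # instead of at random positions.
    centroids = []
    for c in range(k):
        members = [points[i] for i, label in seeds.items() if label == c]
        if not members:
            raise ValueError(f"class {c} has no seed point")
        centroids.append(mean(members))

    labels = None
    for _ in range(n_iter):
        # Assignment step: every point (seeds included) goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: dist(p, centroids[c])) for p in points
        ]
        # Update step: recompute each centroid from its assigned points.
        for c in range(k):
            members = [p for p, label in zip(points, new_labels) if label == c]
            if members:
                centroids[c] = mean(members)
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
    return labels, centroids


# Toy usage: six points, one labelled seed per class (indices 0 and 3).
toy_points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
toy_labels, toy_centroids = seeded_kmeans(toy_points, {0: 0, 3: 1}, k=2)
```

In this sketch the seed labels only influence initialisation, so a seed point may be reassigned in later iterations. Approaches based on pairwise constraints, such as PCKMeans, instead encode prior knowledge as must-link and cannot-link pairs and are not shown here.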

Full Text