Developing Cross-Domain Host-Based Intrusion Detection

Oluwagbemiga Ajayi, Aryya Gangopadhyay, Carl Bursat, Robert F. Erbacher

doi:10.3390/electronics11213631

Abstract

Digital transformation continues to have a remarkable impact on industries, creating new possibilities and improving the performance of existing ones. Recently, cyber-physical systems and the Internet of Things (IoT) have been deployed more widely than at any other time. However, cybersecurity is often an afterthought in the design and implementation of many systems, so new attack surfaces are usually introduced as new systems and applications are deployed. Machine learning has been helpful in creating intrusion detection models, but it is impractical to create attack detection models with acceptable performance for every computing infrastructure and attack scenario because of the cost of collecting quality labeled data and training models. Hence, there is a need for models that can take advantage of knowledge available in a high-resource source domain to improve the performance of a low-resource target domain model. In this work, we propose a novel cross-domain deep learning-based approach for attack detection in Host-based Intrusion Detection Systems (HIDS). Specifically, we developed a method for selecting a candidate source domain from a group of potential source domains by computing the similarity score that a target domain records when paired with each potential source domain. Then, using different word embedding space combination techniques and a transfer learning approach, we leverage the knowledge from a well-performing source domain model to improve the performance of a similar model in the target domain. To evaluate our proposed approach, we used the Leipzig Intrusion Detection Dataset (LID-DS), a HIDS dataset recorded on a modern operating system that consists of different attack scenarios. Our proposed cross-domain approach recorded significant improvement in the target domains compared with the results of the in-domain experiments: the F2-score of the target domain CWE-307 improved from 80% in the in-domain approach to 87% in the cross-domain approach, while that of the target domain CVE-2014-0160 improved from 13% to 85%.
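
For readers unfamiliar with the metric, the F2-score reported above is the F-beta score with beta = 2, which weights recall more heavily than precision. The following is a minimal sketch of the standard definition computed from confusion-matrix counts; the function name `f_beta` and the example counts are illustrative assumptions and are not taken from the paper's experiments.

```python
def f_beta(tp: int, fp: int, fn: int, beta: float = 2.0) -> float:
    """Standard F-beta score from true positives, false positives, false negatives.

    With beta = 2 (the F2-score), recall counts more than precision, which
    suits attack detection where a missed attack is costlier than a false alarm.
    """
    b2 = beta * beta
    denom = (1 + b2) * tp + b2 * fn + fp
    # Return 0.0 when there are no positive predictions or labels at all.
    return (1 + b2) * tp / denom if denom else 0.0


# Hypothetical counts, for illustration only (not results from the paper).
f2 = f_beta(tp=80, fp=40, fn=10)            # recall-weighted score
f1 = f_beta(tp=80, fp=40, fn=10, beta=1.0)  # balanced score for comparison
```

In this hypothetical case recall (80/90) is higher than precision (80/120), so the F2-score comes out higher than the F1-score for the same counts.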

Full Text