Despite the affordability, simplicity, and efficiency of controller area network (CAN) protocols, their security vulnerabilities remain a major challenge. Machine learning-based intrusion detection systems (IDSs) are currently considered an effective approach for improving CAN security by identifying malicious attacks. However, earlier studies that relied on supervised learning methods required considerable amounts of labeled data, and collecting data from vehicles is time-consuming and expensive. Furthermore, the obtained data exhibit class imbalance, which further complicates analysis and model training. Thus, we propose a semi-supervised learning-based IDS that combines a variational autoencoder (VAE) and adversarial reinforcement learning for the multi-class classification of both known and unknown attacks. The proposed system capitalizes on the diverse patterns inherent in unlabeled data, transforming the data space into one that is more conducive to classification. Concurrently, adversarial agents in the reinforcement learning algorithm interact competitively, progressively enhancing their ability to classify and select samples. To reduce the reliance on labeled data and exploit them effectively, we use a pseudo-labeling process for pre-training. For known attacks, experimental results indicate that the proposed model classifies more effectively than the baseline models while requiring less labeled data. By inheriting the advantages of the VAE, the proposed system also detects unknown attacks with similar or completely different characteristics, achieving high F1 scores exceeding 0.9 and 0.84, respectively. Finally, the proposed system is shown to be a lightweight model that rapidly detects malicious messages introduced into in-vehicle networks, keeping latency minimal.