Abstract

As industrial infrastructure, Cyber-Physical Systems (CPS) require effective anomaly detection to remain safe and reliable. However, existing detection methods hit a bottleneck when training data are insufficient. This work proposes a novel anomaly detection approach based on ensemble semi-supervised active learning, which can effectively detect anomalous traffic when there are few labeled samples and the dataset is imbalanced. Specifically, we propose a balanced sampling strategy that combines margin sampling with democratic co-learning to construct a balanced training set consisting of manually labeled high-information samples and automatically labeled high-confidence samples, so that the detection model can be trained effectively on a limited budget. We also find that adding correctly labeled high-confidence samples to the training set improves the performance of the detection model when training samples are few and the labeling budget is limited. The approach achieves a good balance between the effectiveness of model training and the cost of sample querying when CPS traffic data are sparsely labeled and imbalanced. In addition, we designed five pairs of experiments on the NSL-KDD and SWaT datasets, and the results demonstrate the capability and advancement of the proposed approach.
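
The split between manually labeled high-information samples and automatically labeled high-confidence samples can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `margin` and `split_by_margin`, the parameters `query_budget` and `confidence_threshold`, and the example probabilities are all hypothetical. A single model's probability threshold stands in for the democratic co-learning vote, and the class-balancing step is omitted.

```python
from typing import Dict, List, Sequence, Tuple


def margin(probs: Sequence[float]) -> float:
    """Gap between the two largest class probabilities (small gap = uncertain)."""
    top = sorted(probs, reverse=True)
    return top[0] - top[1]


def split_by_margin(
    probabilities: List[Sequence[float]],
    query_budget: int,
    confidence_threshold: float,
) -> Tuple[List[int], Dict[int, int]]:
    """Return (indices to label manually, {index: pseudo-label}).

    Assumption: a plain probability threshold replaces the ensemble vote
    used in democratic co-learning; no class balancing is performed.
    """
    # Margin sampling: the smallest margins are the most informative samples.
    ranked = sorted(range(len(probabilities)), key=lambda i: margin(probabilities[i]))
    to_query = ranked[:query_budget]
    queried = set(to_query)

    # Remaining samples are pseudo-labeled only if the model is confident.
    auto_labeled: Dict[int, int] = {}
    for i, probs in enumerate(probabilities):
        if i in queried:
            continue
        label = max(range(len(probs)), key=lambda c: probs[c])
        if probs[label] >= confidence_threshold:
            auto_labeled[i] = label
    return to_query, auto_labeled


# Hypothetical class probabilities for four unlabeled traffic records.
example = [[0.55, 0.45], [0.98, 0.02], [0.10, 0.90], [0.51, 0.49]]
to_query, auto_labeled = split_by_margin(example, query_budget=2, confidence_threshold=0.95)
```

In this example the two records with the narrowest margins go to the human annotator, the record predicted with probability 0.98 is pseudo-labeled, and the remaining record stays unlabeled because it falls below the threshold.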

