Abstract
Recent studies have highlighted that insider threats are more destructive than external network threats. Despite extensive research on the topic, the spatial heterogeneity and sample imbalance of input features still limit the effectiveness of existing machine learning-based detection methods. To address this problem, we propose a supervised insider threat detection method based on ensemble learning and self-supervised learning. We also propose an entity representation method based on TF-IDF to improve detection performance. Experimental results show that the proposed method can effectively detect malicious sessions in the CERT4.2 and CERT6.2 datasets, where the AUCs are 99.2% and 95.3% in the best case.
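The abstract does not spell out how the TF-IDF-based entity representation is computed, so the following is only a minimal sketch of plain TF-IDF weighting applied to entities within sessions. It assumes each session is a list of entity names; the function name `tfidf_entity_weights` and the sample sessions are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def tfidf_entity_weights(sessions):
    """Return one {entity: tf-idf weight} dict per session.

    Assumption: a session is a list of entity names (hosts, files, URLs).
    This is textbook TF-IDF, not necessarily the paper's exact formulation.
    """
    n_sessions = len(sessions)

    # Document frequency: number of sessions in which each entity appears.
    doc_freq = Counter()
    for session in sessions:
        doc_freq.update(set(session))

    weights = []
    for session in sessions:
        counts = Counter(session)
        total = len(session)
        weights.append({
            # Term frequency within the session times inverse document frequency.
            entity: (count / total) * math.log(n_sessions / doc_freq[entity])
            for entity, count in counts.items()
        })
    return weights


# Hypothetical sessions, for illustration only.
sessions = [
    ["pc-01", "mail.example.com", "report.doc"],
    ["pc-01", "mail.example.com", "usb-drive", "usb-drive"],
    ["pc-01", "wiki.example.com"],
]
weights = tfidf_entity_weights(sessions)
```

In this sketch, an entity that appears in every session (here `pc-01`) gets a weight of zero, while an entity that is frequent within one session but rare across sessions (here `usb-drive`) gets a comparatively high weight.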
Highlights
With the rapid development of information security, threats originating inside systems, known as insider threats, have received increasing attention.
Traditional insider threat detection methods are mostly based on manually extracted user behavior features; their performance often depends on the quality of the extracted features.
These methods struggle to model user behavior over long time series. These factors limit the effectiveness of insider threat detection. Therefore, some recent work has turned to temporal models, especially those based on deep learning, for insider threat detection [2, 3]. These models capture a user's behavior over a period and automatically derive feature representations of behavior sequences.
Summary
With the rapid development of information security, threats originating inside systems, known as insider threats, have received increasing attention. Compared with threats from external systems, insider threats can be more harmful to companies and military organizations, because insiders can easily exploit loopholes in system implementations and business processes to carry out actions that threaten system security, including theft of confidential information, commercial fraud, and system destruction [1]. Traditional insider threat detection methods are mostly based on manually extracted user behavior features; their performance often depends on the quality of the extracted features. These methods struggle to model user behavior over long time series. Temporal models, by contrast, capture a user's behavior over a period and automatically derive feature representations of behavior sequences. On this basis, outlier analysis is carried out to identify malicious insiders.