Abstract

Insider threat detection has drawn increasing attention in recent years. A malicious insider’s digital footprints are scattered across a wide range of audit data sources over a long period of time. To capture them, existing approaches often either use a scoring mechanism to orchestrate alerts generated by multiple sub-detectors, or rely on domain-knowledge-based feature engineering to conduct a one-off analysis across multiple types of data. These approaches result in high deployment complexity and incur additional costs for engaging security experts. In this paper, we present a novel approach that works with a variety of security logs. The security logs are transformed into texts in the same format and then arranged as a corpus. Using a Word2vec model trained on the corpus, we approximate the posterior probabilities of insider behaviours. Accordingly, we label a transformed event as suspicious if its behavioural probability is smaller than a given threshold, and we label a user as malicious if he/she is associated with multiple suspicious events. The experiments are undertaken with the Carnegie Mellon University (CMU) CERT Program’s insider threat database v6.2. They not only demonstrate that the proposed approach is effective and scalable in practical applications but also provide guidance for tuning the parameters and thresholds.
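The two-level labelling rule stated in the abstract can be illustrated with a minimal sketch. It assumes that each transformed event already carries a behavioural probability. The function names, the sample values, the threshold of 0.05 and the reading of "multiple" as at least two events are all hypothetical choices made for illustration and are not taken from the paper.

```python
from collections import Counter


def label_events(event_probs, prob_threshold):
    """Return (user, event_id) pairs whose probability is below the threshold.

    event_probs: iterable of (user, event_id, probability) tuples.
    prob_threshold: hypothetical cut-off; events below it count as suspicious.
    """
    return [(user, event_id)
            for user, event_id, prob in event_probs
            if prob < prob_threshold]


def label_users(suspicious_events, min_suspicious_events=2):
    """Return users associated with at least min_suspicious_events events.

    The default of 2 is an assumed reading of "multiple suspicious events".
    """
    counts = Counter(user for user, _ in suspicious_events)
    return {user for user, n in counts.items() if n >= min_suspicious_events}


# Made-up behavioural probabilities, for illustration only.
events = [
    ("alice", "e1", 0.40),
    ("alice", "e2", 0.02),
    ("bob", "e3", 0.01),
    ("bob", "e4", 0.03),
    ("bob", "e5", 0.55),
]

suspicious = label_events(events, prob_threshold=0.05)
flagged = label_users(suspicious, min_suspicious_events=2)
print(flagged)  # {'bob'}
```

In this toy input, alice has a single suspicious event and is not flagged, while bob has two and is. Both thresholds are tuning parameters of the kind the experiments are said to give guidance on.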

Highlights

  • Malicious insiders have been recognised as the most critical security threat to an organisation [1], [2]

  • To overcome the limitations of existing approaches, we present a novel approach that realises behavioural-analysis-based insider threat detection using a corpus transformed from various security logs

  • We propose a new approach to deal with insider threats, which reconstructs semantic properties from multiple types of security logs and detects insider threats from a behavioural analysis perspective


Summary

INTRODUCTION

Malicious insiders have been recognised as the most critical security threat to an organisation [1], [2]. Another category of existing approaches applies machine learning algorithms to features extracted from all relevant audit data and flags those that deviate significantly from the rest as suspicious [8]. These approaches suffer from one or more limitations, the first being that individual sub-detectors fail to indicate the presence of an insider with both high confidence and a low false positive rate (FPR). We obtain the likelihood that a particular behaviour is suspicious from the similarities between words, which we get by querying a Word2vec model trained on the corpus generated from multiple types of security logs. Based on such likelihoods, we are able to detect insiders who behave unusually.
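As a rough illustration of how records from different security logs might be brought into one text format and arranged as a corpus, consider the sketch below. The field names (`date`, `user`, `pc`, `activity`), the token prefixes, the hour-of-day token and the sample records are assumptions made for this example. The summary does not specify the actual transformation used in the paper.

```python
from datetime import datetime


def event_to_sentence(source, record):
    """Render one log record as a list of tokens in a shared format.

    source: name of the log type (e.g. "logon"); hypothetical labels.
    record: dict with assumed keys "date", "user", "pc" and "activity".
    """
    ts = datetime.strptime(record["date"], "%m/%d/%Y %H:%M:%S")
    activity = record["activity"].lower().replace(" ", "_")
    return [
        f"user_{record['user']}",
        f"pc_{record['pc']}",
        f"src_{source}",
        f"act_{activity}",
        f"hour_{ts.hour:02d}",  # coarse time-of-day token (an assumption)
    ]


def build_corpus(logs):
    """Flatten {source: [records]} into a list of token lists."""
    corpus = []
    for source, records in logs.items():
        for record in records:
            corpus.append(event_to_sentence(source, record))
    return corpus


# Made-up records, for illustration only.
logs = {
    "logon": [{"date": "01/04/2010 08:15:00", "user": "ABC0001",
               "pc": "PC-1234", "activity": "Logon"}],
    "device": [{"date": "01/04/2010 23:40:10", "user": "ABC0001",
                "pc": "PC-1234", "activity": "Connect"}],
}

corpus = build_corpus(logs)
for sentence in corpus:
    print(" ".join(sentence))
```

Each event becomes one "sentence" of tokens regardless of its source log. A corpus in this list-of-token-lists shape is the usual input for Word2vec implementations, so a single model can be trained over all log types at once.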

RELATED WORK
MALICIOUS INSIDER DETECTION VIA WORD EMBEDDING
MALICIOUS INSIDER DETECTION USING BEHAVIOURAL PROBABILITIES
EXPERIMENTS AND RESULTS
EXPERIMENTAL SETTING
PERFORMANCE RESULTS
CONCLUSION