Abstract
DISTRIBUTION A. Approved for public release as of 24 February 2021: distribution unlimited. Cyber threats constantly evolve, making detection of novel attacks difficult. These threats and attacks are often anomalous, especially in large-scale networks that generate a significant amount of normal traffic. The ratio of normal to anomalous network traffic can cause extreme dataset imbalances, making such datasets poorly suited for traditional Machine Learning (ML) algorithms. Anomaly detection algorithms provide avenues for identifying these types of cyber threats. Evaluating the features that contribute to the identification of anomalous cyber events is critical in determining the type of cyber threat that an anomalous event may represent. We evaluate feature contribution using two well-known anomaly detection algorithms, Local Outlier Factor (LOF) and Isolation Forest (IF). Each of these unsupervised learning algorithms identifies anomalous data differently: where LOF uses regional density methods, IF uses decision trees. We evaluated a Euclidean-based measurement for LOF and a localized Depth-based Isolation Forest Feature Importance (DIFFI) method for IF to extract local feature importance. We demonstrate the extent to which feature evaluation methods depend on the anomaly detection algorithm used, and provide insight into features important for anomaly detection in cyber data using synthetic and real-world data. Finally, we discuss the integration of anomaly scoring explainability into a novel cyber platform called the Pacific Ecosystem for Cyber (PEcoC, pronounced “peacock”). PEcoC leverages local explainability to accelerate analyst response, increase the body of knowledge available to analysts, and lower barriers to entry for trainees and junior analysts. Ultimately, integrating local explainability of anomaly detection provides useful and important context for expert cyber analysts responsible for network defense.
Anomaly detection within PEcoC complements existing defensive-cyber TTPs and signature-based detection systems, and augments analysts’ workflows and analytical options.
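To make the idea of a Euclidean-based local feature importance more concrete, the following is a minimal sketch in Python. It is not the measurement evaluated in this work, whose details the abstract does not give. It assumes one plausible reading: for a flagged point, each feature's importance is its share of the squared Euclidean distance to the point's k nearest neighbors. The function names `knn` and `feature_contributions`, the parameter `k`, and the toy data are all hypothetical.

```python
import math


def knn(point, data, k):
    """Return the k rows of `data` closest to `point` by Euclidean distance."""
    return sorted(data, key=lambda row: math.dist(point, row))[:k]


def feature_contributions(point, data, k=3):
    """Per-feature share of the squared Euclidean distance from `point`
    to its k nearest neighbors in `data`.

    Assumption: this is an illustrative stand-in, not the authors' method.
    """
    neighbors = knn(point, data, k)
    per_feature = [
        sum((point[j] - n[j]) ** 2 for n in neighbors)
        for j in range(len(point))
    ]
    total = sum(per_feature)
    if total == 0:
        # The point coincides with all of its neighbors; nothing to attribute.
        return [0.0] * len(point)
    return [value / total for value in per_feature]


# Toy data (hypothetical): normal traffic clusters near (1, 1);
# the flagged point deviates only in the second feature.
normal = [(1.0, 1.0), (1.1, 0.9), (0.9, 1.1), (1.0, 1.2), (1.2, 1.0)]
anomaly = (1.0, 9.0)
shares = feature_contributions(anomaly, normal, k=3)
```

In this toy case nearly all of the distance comes from the second feature, so it receives almost the entire share. A neighbor-based attribution of this kind pairs naturally with a density-based detector such as LOF, whereas a tree-based detector such as IF needs an attribution derived from the trees themselves, which is the role DIFFI plays in the abstract.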