A machine learning-based FinTech cyber threat attribution framework using high-level indicators of compromise

Umara Noor,Zahid Anwar,Tehmina Amjad,Kim-Kwang Raymond Choo

doi:10.1016/j.future.2019.02.013

Umara Noor, Zahid Anwar + Show 2 more

https://doi.org/10.1016/j.future.2019.02.013

Copy DOI

Export

Save

Cite

Abstract
Full-Text
Similar Papers

Abstract

Listen

Cyber threat attribution identifies the source of a malicious cyber activity, which in turn informs cyber security mitigation responses and strategies. Such responses and strategies are crucial for deterring future attacks, particularly in the financial and critical infrastructure sectors. However, existing approaches generally rely on manual analysis of attack indicators obtained through approaches such as trace-back, firewalls, intrusion detection and honeypot deployments. These attack indicators, also known as low-level Indicators of Compromise (IOCs), are rarely re-used and can be easily modified and disguised resulting in a deceptive and biased cyber threat attribution. Cyber attackers, particularly financially-motivated actors, can use common high-level attack patterns that evolve less frequently as compared to the low-level IOCs. To attribute cyber threats effectively, it is necessary to identify them based on the high-level adversary’s attack patterns (e.g. tactics, techniques and procedures - TTPs, software tools and malware) employed in different phases of the cyber kill chain. Identification of high-level attack patterns is time-consuming, requiring forensic investigation of the victim network(s) and other resources. In the rare case that attack patterns are reported in cyber threat intelligence (CTI) reports, the format is textual and unstructured typically taking the form of lengthy incident reports prepared for human consumption (e.g. prepared for C-level and senior management executives), which cannot be directly interpreted by machines. Thus, in this paper we propose a framework to automate cyber threat attribution. Specifically, we profile cyber threat actors (CTAs) based on their attack patterns extracted from CTI reports, using the distributional semantics technique of Natural Language Processing. Using these profiles, we train and test five machine learning classifiers on 327 CTI reports collected from publicly available incident reports that cover events from May 2012 to February 2018. It is observed that the CTA profiles obtained attribute cyber threats with a high precision (i.e. 83% as compared to other publicly available CTA profiles, where the precision is 33%). The Deep Learning Neural Network (DLNN) based classifier also attributes cyber threats with a higher accuracy (i.e. 94% as compared to other classifiers).

Full Text