Abstract Classifying tactics and techniques in cyber threat intelligence (CTI) is an important way to obtain tactics, techniques and procedures (TTPs) and characterize the behavior of cyber attacks. However, tactic and technique information is highly abstract and usually appears in CTI as natural language text, which makes it difficult for traditional manual analysis and feature-engineering-based machine learning methods to classify tactics and techniques effectively. Meanwhile, flat deep learning methods perform poorly on the more fine-grained technique classification because they cannot exploit the hierarchical relationship between tactics and techniques. Therefore, this paper treats the tactics and techniques defined in the Adversarial Tactics, Techniques and Common Knowledge (ATT&CK) knowledge base as labels and proposes a Convolutional Neural Network (CNN) model based on hierarchical knowledge migration and an attention mechanism for classifying tactics and techniques in CTI, named HM-ACNN (CNN based on hierarchical knowledge migration and attention mechanism). HM-ACNN classifies tactics and techniques in two phases, with an attention-based CNN as the underlying network in both. In the first phase, HM-ACNN converts the CTI text into a two-dimensional image using a word embedding model and trains the attention-based CNN on tactic classification before classifying techniques. In the second phase, after tactic classification training is complete, knowledge is migrated from tactics to techniques by transferring the parameters of the convolutional and attention layers learned during tactic classification, exploiting the hierarchical relationship between tactics and techniques; technique classification is then completed by fine-tuning. Experimental results show that HM-ACNN performs well on both the tactic and technique classification tasks, reaching F1 values of 93.66% and 86.29%, respectively, which outperform other models such as CNN, recurrent neural networks and CRNN.
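The following is a minimal sketch of the two-phase training idea summarized in the abstract: an attention-based CNN is first trained on the coarser tactic labels, and its convolutional and attention parameters are then carried over to initialize the technique classifier before fine-tuning. All names, layer sizes, and the direct weight-copy step are illustrative assumptions, not the authors' implementation, and the label counts are placeholders.

```python
# Minimal sketch (PyTorch) of hierarchical knowledge migration between two
# attention-based CNN classifiers. Names and hyperparameters are assumptions.
import torch
import torch.nn as nn

class AttentionCNN(nn.Module):
    """Attention-based CNN over CTI text rendered as a 2-D embedding 'image'."""
    def __init__(self, embed_dim: int, num_classes: int):
        super().__init__()
        self.conv = nn.Conv2d(1, 64, kernel_size=(3, embed_dim))  # convolve over token windows
        self.attn = nn.Linear(64, 1)                              # additive attention scores
        self.fc = nn.Linear(64, num_classes)                      # label-specific output head

    def forward(self, x):                          # x: (batch, 1, seq_len, embed_dim)
        h = torch.relu(self.conv(x)).squeeze(-1)   # (batch, 64, seq_len - 2)
        h = h.transpose(1, 2)                      # (batch, seq_len - 2, 64)
        w = torch.softmax(self.attn(h), dim=1)     # attention weights over positions
        ctx = (w * h).sum(dim=1)                   # attended feature vector (batch, 64)
        return self.fc(ctx)

# Phase 1: train on the coarser tactic labels (training loop omitted here).
tactic_model = AttentionCNN(embed_dim=300, num_classes=14)

# Phase 2: migrate the convolutional and attention parameters to the technique
# classifier (a plain copy here; the paper transforms them using the
# tactic-technique hierarchy), then fine-tune on technique labels.
technique_model = AttentionCNN(embed_dim=300, num_classes=200)
technique_model.conv.load_state_dict(tactic_model.conv.state_dict())
technique_model.attn.load_state_dict(tactic_model.attn.state_dict())
```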