Leveraging BERT's Power to Classify TTP from Unstructured Text

Paulo M M R Alves, Geraldo P R Filho, Vinicius P Goncalves

doi:10.1109/wcnps56355.2022.9969697

Abstract

Tactics, Techniques and Procedures (TTPs) are valuable information for cyber-security analysts, but they are mostly disseminated through unstructured text. This work proposes tackling this problem with BERT models, a state-of-the-art approach in Natural Language Processing. We investigate the effect of selected hyperparameters on the fine-tuning of the models. MITRE's example sentences are used to fine-tune eleven BERT models, with the aim of finding the best model and the best combination of hyperparameters for classifying TTPs according to the ATT&CK framework. The best models achieved accuracies of 82.64% and 78.75% on the two datasets tested, demonstrating the potential of BERT models for the complex task of TTP classification. Finally, we draw insights from the misclassified data that help to better understand the dataset and how the models handle and classify it.
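
The model and hyperparameter search described in the abstract can be pictured with the minimal sketch below. It is not the authors' code. The abstract does not name the hyperparameters or the eleven models, so the grid keys (`learning_rate`, `batch_size`), the model names, and the `train_and_eval` callback are all hypothetical placeholders. In practice the callback would fine-tune one BERT variant on the labeled sentences and return its accuracy on held-out data.

```python
from itertools import product
from typing import Callable, Dict, List, Sequence, Tuple


def accuracy(predicted: Sequence[str], expected: Sequence[str]) -> float:
    """Fraction of sentences whose predicted technique ID matches the label."""
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must have the same length")
    if not expected:
        raise ValueError("cannot compute accuracy on an empty dataset")
    return sum(p == e for p, e in zip(predicted, expected)) / len(expected)


def grid_search(
    model_names: Sequence[str],
    grid: Dict[str, List],
    train_and_eval: Callable[[str, Dict], float],
) -> Tuple[float, str, Dict]:
    """Try every model x hyperparameter combination and keep the best.

    `train_and_eval` is a hypothetical callback: given a model name and a
    dict of hyperparameters, it should fine-tune that model and return its
    accuracy. Ties keep the first combination encountered.
    """
    keys = sorted(grid)  # fixed key order so runs are reproducible
    best = None
    for name in model_names:
        for values in product(*(grid[k] for k in keys)):
            params = dict(zip(keys, values))
            score = train_and_eval(name, params)
            if best is None or score > best[0]:
                best = (score, name, params)
    if best is None:
        raise ValueError("no model/hyperparameter combinations to evaluate")
    return best
```

A caller would pass its own list of model checkpoints and a fine-tuning routine as `train_and_eval`. The function returns the best accuracy together with the model name and hyperparameters that produced it, so separate runs can be compared per dataset.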


