A framework for threat intelligence extraction and fusion

Yongyan Guo, Zhengyu Liu, Cheng Huang, Nannan Wang, Hai Min, Wenbo Guo, Jiayong Liu

doi:10.1016/j.cose.2023.103371

Abstract

Cyber-attacks are becoming increasingly sophisticated and difficult to handle as new attack techniques emerge, posing serious threats to companies and individuals alike. Analyzing attack incidents and tracing the attack groups behind them has therefore become extremely important. Threat intelligence provides a new technical solution for attack traceability by constructing a Cybersecurity Knowledge Graph (CKG). In this paper, we propose a framework for threat intelligence extraction and fusion that extracts, correlates, and unifies cybersecurity entity-relation triples from structured and unstructured data. Existing entity and relation extraction for cybersecurity concepts relies on the traditional pipeline model, which suffers from error propagation and ignores the connection between the two subtasks. To solve this problem, we propose a joint entity and relation extraction model for cybersecurity concepts. We model joint extraction as a multiple sequence labeling problem, generating a separate label sequence for each relation; each sequence encodes the entities involved and which of them is the subject and which is the object of that relation. Experimental results on Open Source Intelligence (OSINT) data show that the joint model achieves an F1 value of 81.37%, which is better than the previous pipeline model. For knowledge fusion, we propose an improved Levenshtein distance to correlate identical entities extracted from different data sources and construct a preliminary CKG, which is demonstrated in the Experiments section.
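
To make the knowledge fusion step concrete, the following minimal sketch shows how an edit-distance score can decide whether two entity mentions from different sources refer to the same entity. It uses the standard Levenshtein distance, not the improved variant proposed in the paper, whose details the abstract does not give. The function names, the lowercase normalization, and the 0.8 threshold are assumptions chosen for illustration only.

```python
def levenshtein(a: str, b: str) -> int:
    """Standard Levenshtein distance (not the paper's improved variant)."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Length-normalized similarity in [0, 1]; normalization is an assumption."""
    a, b = a.lower().strip(), b.lower().strip()
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def same_entity(a: str, b: str, threshold: float = 0.8) -> bool:
    """Hypothetical fusion rule: merge mentions whose similarity meets the threshold."""
    return similarity(a, b) >= threshold
```

Under these assumptions, `same_entity("Fancy Bear", "FancyBear")` treats the two mentions as one entity, while `same_entity("APT28", "Lazarus")` keeps them apart. A plain string distance cannot link aliases that share no characters, which is one reason a practical fusion step would need more than this sketch.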

Full Text