Abstract
The cyber threat intelligence (CTI) knowledge graph is a valuable tool for aiding security practitioners in the identification and analysis of cyberattacks. These graphs are constructed from CTI data, organized into relational triples, where each triple comprises two entities linked by a particular relation. However, as the volume of CTI data is expanding at a faster rate than predicted, existing technologies are unable to extract relational triples quickly and accurately. This work mainly focuses on the extraction of relational triples in CTI data, which is achieved by an enhanced representation and binary tagging framework (ERBTF). Firstly, we introduce embedding representations for relations and concatenate these with word embeddings to obtain the initial hidden representation. Subsequently, we employ a novel dilated convolutional encoder that consists of a dilated convolution neural network, gate mechanism and residual connection to enhance the learned contextual representation. Afterwards, we adopt an attention module that includes multi-head self-attention and position-wise feed-forward neural network to allocate greater attention to words that significantly influence the specific relation. Additionally, we utilize the straightforward yet efficient binary entity tagger to identify subject and object entities under different relations for constructing relational triples. We conduct massive experiments on relational triple extraction from CTI data, the results show that ERBTF is superior to the existing relation extraction models, and achieves state-of-the-art performance.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.