Data and knowledge-driven named entity recognition for cyber security

Chen Gao, Xuan Zhang, Hui Liu

doi:10.1186/s42400-021-00072-y

Abstract

Named Entity Recognition (NER) for cyber security aims to identify and classify cyber security terms in large volumes of heterogeneous, multisource cyber security text. Deep neural networks learn text features automatically from large datasets, but this data-driven approach usually handles rare entities poorly. Gasmi et al. proposed a deep learning method for named entity recognition in cyber security and achieved good results, reaching an F1 value of 82.8%, but it still struggles to identify rare entities and complex words accurately. To address this challenge, this paper proposes a new model that combines data-driven deep learning with knowledge-driven dictionary methods, building dictionary features to assist rare entity recognition. In addition, on top of the data-driven deep learning model, an attention mechanism is adopted to enrich the local features of the text, model the context better, and improve the recognition of complex entities. Experimental results show that our method outperforms the baseline model and identifies cyber security entities more effectively: Precision, Recall and F1 reach 90.19%, 86.60% and 88.36%, respectively.
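The reported F1 value can be cross-checked against the reported Precision and Recall, because F1 is their harmonic mean. The short sketch below does that arithmetic; the function name `f1_score` is an illustrative choice and does not come from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both scores are zero
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract, in percent.
reported_precision = 90.19
reported_recall = 86.60

f1 = f1_score(reported_precision, reported_recall)
print(f"{f1:.2f}")  # prints 88.36, matching the reported F1
```

The recomputed value agrees with the abstract to two decimal places.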

Highlights

There is a large amount of unstructured cyber security data on the Internet, which cyber security systems cannot easily identify and use directly.
These data usually come from cyber security blogs, company communities, and related databases such as Common Vulnerabilities and Exposures (CVE) and the National Vulnerability Database (NVD).
All of the above models improve cyber security entity recognition to some extent, but none of them combines domain knowledge with data-driven deep learning methods effectively.
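One common way to bring domain knowledge into a data-driven tagger is to look each token up in a domain dictionary and feed the match result to the model as an extra per-token feature. The sketch below illustrates that general idea with a greedy longest-match lookup. It is not the paper's implementation: the dictionary entries, the entity type names, and the function `dictionary_features` are all hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical toy dictionary: phrase (lowercase tokens) -> entity type.
SECURITY_DICTIONARY: Dict[Tuple[str, ...], str] = {
    ("buffer", "overflow"): "VULNERABILITY",
    ("sql", "injection"): "VULNERABILITY",
    ("openssl",): "SOFTWARE",
}


def dictionary_features(
    tokens: List[str],
    dictionary: Dict[Tuple[str, ...], str] = SECURITY_DICTIONARY,
) -> List[str]:
    """Return one BIO-style dictionary feature per token (greedy longest match)."""
    lowered = [t.lower() for t in tokens]
    max_len = max((len(key) for key in dictionary), default=0)
    features = ["O"] * len(tokens)
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate phrase first, then shorter ones.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            label = dictionary.get(tuple(lowered[i:i + n]))
            if label is not None:
                features[i] = f"B-{label}"
                for j in range(i + 1, i + n):
                    features[j] = f"I-{label}"
                i += n
                matched = True
                break
        if not matched:
            i += 1
    return features


tokens = "A buffer overflow in OpenSSL allows remote attackers".split()
for token, feature in zip(tokens, dictionary_features(tokens)):
    print(token, feature)
```

In a neural tagger, such features would typically be embedded and concatenated with the word representation before the sequence encoder, so that a rare term found in the dictionary still carries a useful signal even when it appears only a few times in the training data.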

Summary

Introduction

There is a large amount of unstructured cyber security data on the Internet, which cyber security systems cannot easily identify and use directly. Collobert et al. (2011) proposed an effective neural network model that can learn word vectors from large amounts of unlabelled text. Huang et al. (2015) proposed the BiLSTM-CRF model, which combines neural network and statistical machine learning methods.

Journal: Cybersecurity
Publication Date: May 3, 2021
Citations: 24
License type: open-access
