Understanding Cybersecurity Threat Trends Through Dynamic Topic Modeling.

Jennifer Sleeman, Tim Finin, Milton Halem

doi:10.3389/fdata.2021.601529


Open Access

https://doi.org/10.3389/fdata.2021.601529


Abstract

Cybersecurity threats continue to increase and are impacting almost all aspects of modern life. Being aware of how vulnerabilities and their exploits are changing gives helpful insights into combating new threats. Applying dynamic topic modeling to a time-stamped cybersecurity document collection shows how the significance and details of concepts found in them are evolving. We correlate two different temporal corpora, one with reports about specific exploits and the other with research-oriented papers on cybersecurity vulnerabilities and threats. We represent the documents, concepts, and dynamic topic modeling data in a semantic knowledge graph to support integration, inference, and discovery. A critical insight into discovering knowledge through topic modeling is seeding the knowledge graph with domain concepts to guide the modeling process. We use Wikipedia concepts to provide a basis for performing concept phrase extraction and show how using those phrases improves the quality of the topic models. Researchers can query the resulting knowledge graph to reveal important relations and trends. This work is novel because it uses topics as a bridge to relate documents across corpora over time.

Highlights

Cybersecurity is a crucial computing area vital to our society due to the rise in cyberattacks and the damage they can do (Symantec, 2019).
Dynamic topic modeling (DTM) (Blei and Lafferty, 2006) provides a means for performing topic modeling over time.
Dynamic topic modeling (DTM) has been used in many applications, including science research (Blei and Lafferty, 2006), software (Hu et al., 2015), finance (Morimoto and Kawasaki, 2017), music (Shalit et al., 2013), and climate change (Sleeman et al., 2016; Sleeman et al., 2017), to understand how particular domains have changed over time.

Summary

INTRODUCTION

Cybersecurity is a crucial computing area, vital to our society because of the rise in cyberattacks and the damage they can do (Symantec, 2019). For many years, researchers have used artificial intelligence techniques to extract information from documents such as security bulletins, after-action reports, and descriptions of new software vulnerabilities. Most work in this area has used language understanding technology that extracts references to entities, such as malware instances, software products, IP addresses, or process names, and the relations between them. Although these data have been helpful for many purposes, they have not addressed temporal aspects of how the cybersecurity landscape has changed over the years. The heavy use of acronyms and multi-word phrases with non-compositional semantics exacerbates the problem. To address these issues, we extract common cybersecurity concepts from Wikipedia data and identify phrases and acronyms that refer to them. We put forth this work to show how temporal analysis by means of cross-domain understanding can be applied to cybersecurity and could be used to foster a document-based search tool.
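
To illustrate the idea of mapping phrases and acronyms to shared concepts before modeling topics over time, here is a minimal Python sketch. It is not the authors' pipeline and it does not perform dynamic topic modeling. It only shows the preprocessing step of collapsing multi-word phrases and acronyms into single concept tokens and then counting them per time slice. The concept lexicon `CONCEPTS`, the function names, and the example documents are all hypothetical; in the paper the concepts are derived from Wikipedia data.

```python
import re
from collections import Counter, defaultdict

# Hypothetical lexicon: canonical concept token -> surface forms
# (multi-word phrases and acronyms). Invented here for illustration only.
CONCEPTS = {
    "denial_of_service": ["denial of service", "dos attack"],
    "sql_injection": ["sql injection", "sqli"],
    "cross_site_scripting": ["cross-site scripting", "cross site scripting", "xss"],
}


def build_pattern(concepts):
    """Compile one regex over all surface forms, longest form first."""
    surface_to_token = {
        form: token for token, forms in concepts.items() for form in forms
    }
    ordered = sorted(surface_to_token, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(f) for f in ordered) + r")\b")
    return pattern, surface_to_token


def merge_concepts(text, pattern, surface_to_token):
    """Lowercase the text and replace each surface form with its concept token."""
    return pattern.sub(lambda m: surface_to_token[m.group(1)], text.lower())


def concept_counts_by_year(docs, concepts):
    """Count concept tokens per time slice for (year, text) pairs."""
    pattern, surface_to_token = build_pattern(concepts)
    counts = defaultdict(Counter)
    for year, text in docs:
        merged = merge_concepts(text, pattern, surface_to_token)
        for token in re.findall(r"[a-z0-9_]+", merged):
            if token in concepts:
                counts[year][token] += 1
    return counts
```

Merging each phrase into one token keeps a bag-of-words model from splitting a non-compositional term such as "denial of service" into unrelated words. The per-year counts are only a rough stand-in for the time slices that a dynamic topic model would consume.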

BACKGROUND

Dynamic Topic Models

RELATED WORK

APPROACH

Extracting Knowledge From Unstructured Text

Topic Models Over Time

Automatic Knowledge Graph Generation and Use

CYBERSECURITY DATA SETS

EXPERIMENTS AND ANALYSIS

Concept Context Experiment

Contextual Classification Experiments

Dynamic Model Experiment

USING THE KNOWLEDGE GRAPH FOR SEARCH

CONCLUSION AND FUTURE WORK

Findings

DATA AVAILABILITY STATEMENT


Journal: Frontiers in Big Data
Publication Date: Jun 29, 2021
Citations: 5
License type: CC BY 4.0
