Abstract

A Host-based Intrusion Detection System (HIDS) is an effective last line of defense against cyber security attacks after perimeter defenses (e.g., Network-based Intrusion Detection Systems and firewalls) have failed or been bypassed. HIDS is widely adopted in industry, ranking among the top two most used security tools in the Security Operation Centers (SOCs) of organizations. Although effective and efficient HIDS is highly desirable for industrial organizations, increasingly complex attack patterns pose several challenges that degrade HIDS performance (e.g., a high false alert rate that creates alert fatigue for SOC staff). Because Natural Language Processing (NLP) methods are better suited to identifying complex attack patterns, a growing number of HIDS leverage advances in NLP, which have been shown to perform effectively and efficiently in precisely detecting low-footprint, zero-day attacks and predicting an attacker’s next steps. This active research trend calls for a synthesized and comprehensive body of knowledge on NLP-based HIDS. Despite the rapidly growing adoption of NLP in HIDS development, relatively little effort has been devoted to systematically analyzing and synthesizing the available peer-reviewed literature to understand how NLP is used in HIDS development. The lack of such a body of knowledge on this important topic motivated us to conduct a Systematic Literature Review (SLR) of papers covering the end-to-end pipeline of NLP use in HIDS development. For this end-to-end pipeline, we identify, taxonomically categorize and systematically compare the state of the art in NLP methods used in HIDS, the attacks detected by these methods, and the datasets and evaluation metrics used to evaluate NLP-based HIDS. We highlight the relevant prevalent practices, considerations, advantages and limitations to support HIDS developers. We also outline future research directions for NLP-based HIDS development.

