Abstract

A webshell is a command execution environment in the form of web pages. Attackers often use it as a backdoor for operating a web server, so accurate webshell detection is important for web server protection. Most security products detect webshells by feature matching, that is, by matching input scripts against pre-built collections of malicious code. Feature matching has a low detection rate for obfuscated webshells. With the help of machine learning algorithms, however, webshells can be detected more efficiently and accurately. In this paper, we propose a new PHP webshell detection model, the NB-Opcode (naïve Bayes and opcode sequence) model, which combines naïve Bayes classifiers with opcode sequences. Experiments on a large number of samples show that the proposed method can effectively detect a range of webshells. Compared with traditional webshell detection methods, this method improves the efficiency and accuracy of webshell detection.

Highlights

  • With the development of web technology and the explosive growth of information, web security becomes more and more important

  • Opcode is the intermediate language after PHP script compilation, and its relationship with PHP is analogous to Java virtual machine (JVM) byte-code’s relationship to Java

  • A webshell detection method based on a naïve Bayes algorithm and opcode sequence is proposed
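
As a rough illustration of the idea in the highlights above, the sketch below trains a multinomial naïve Bayes classifier on opcode bigrams. It is a minimal sketch under stated assumptions, not the paper's implementation: the class name `OpcodeNaiveBayes`, the helper `bigrams`, the choice of bigrams with add-one smoothing, and the four hand-written opcode sequences are all hypothetical and are not taken from the paper's dataset or from real opcode dumps.

```python
import math
from collections import Counter


def bigrams(opcodes):
    """Turn an opcode sequence into overlapping 2-gram features."""
    return [" ".join(opcodes[i:i + 2]) for i in range(len(opcodes) - 1)]


class OpcodeNaiveBayes:
    """Multinomial naive Bayes with add-one smoothing over opcode bigrams."""

    def fit(self, sequences, labels):
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)
        self.gram_counts = {c: Counter() for c in self.classes}
        for seq, label in zip(sequences, labels):
            self.gram_counts[label].update(bigrams(seq))
        # Vocabulary is every bigram seen in any class during training.
        self.vocab = set().union(*self.gram_counts.values())
        return self

    def log_score(self, seq, c):
        """Unnormalised log posterior of class c for one opcode sequence."""
        total = sum(self.gram_counts[c].values())
        score = math.log(self.doc_counts[c] / sum(self.doc_counts.values()))
        for gram in bigrams(seq):
            if gram in self.vocab:  # bigrams never seen in training are skipped
                count = self.gram_counts[c][gram]
                score += math.log((count + 1) / (total + len(self.vocab)))
        return score

    def predict(self, seq):
        return max(self.classes, key=lambda c: self.log_score(seq, c))


# Hand-written toy sequences (assumption: illustrative only, not real dumps).
train_sequences = [
    ["ASSIGN", "CONCAT", "ECHO", "RETURN"],
    ["INIT_FCALL", "SEND_VAL", "DO_ICALL", "ECHO", "RETURN"],
    ["FETCH_R", "FETCH_DIM_R", "INCLUDE_OR_EVAL", "RETURN"],
    ["FETCH_R", "FETCH_DIM_R", "ASSIGN", "INCLUDE_OR_EVAL", "RETURN"],
]
train_labels = ["benign", "benign", "webshell", "webshell"]

model = OpcodeNaiveBayes().fit(train_sequences, train_labels)
print(model.predict(["FETCH_R", "FETCH_DIM_R", "INCLUDE_OR_EVAL", "RETURN"]))
```

Because the classifier sees only opcode names, two scripts that differ in variable names or string literals but compile to the same opcode sequence receive the same score.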

Summary

Introduction

With the development of web technology and the explosive growth of information, web security becomes more and more important. Web vulnerabilities such as SQL injection and XSS attacks [1] are some of the most common security problems. Attackers often exploit vulnerabilities in the system or in web applications to upload a malicious file or malicious code to the web server. They use a range of methods to bypass traditional detection, including malicious function segmentation, Base encoding, and other techniques. Traditional webshell detection methods are ineffective against webshells that have been obfuscated in these ways.

Webshell
Simple Webshell
Machine Learning
Unsupervised Learning
Static and Dynamic Detection
Flow Analysis Detection
Log Analysis Detection
Behavior Analysis Detection
Statistical Analysis
Threats
Plain Webshell
Obfuscated Webshell
Split Webshell
Remote Webshell
Proposed Solution
Opcode
Data Preprocessing
Feature Extraction and Representation
Word Bag and TF-IDF Models
Model Training and Validation
Experiments
Effectiveness of the Approach
Conclusions and Future Work
