Web shell is a malicious script file that can harm web servers. Web shell is often used by intruders to perform a series of malicious operations on website servers, such as privilege escalation and sensitive information leakage. Existing web shell detection methods have some shortcomings, such as viewing a single network traffic behavior, using simple signature comparisons, and adopting easily bypassed regex matches. In view of the above deficiencies, a web shell detection method based on multiview feature fusion is proposed based on the PHP language web shell. Firstly, lexical features, syntactic features, and abstract features that can effectively represent the internal meaning of web shells from multiple levels are integrated and extracted. Secondly, the Fisher score is utilized to rank and filter the most representative features, according to the importance of each feature. Finally, an optimized support vector machine (SVM) is used to establish a model that can effectively distinguish between web shell and normal script. In large-scale experiments, the final classification accuracy of the model on 1056 web shells and 1056 benign web scripts reached 92.18%. The results also surpassed well-known web shell detection tools such as VirusTotal, ClamAV, LOKI, and CloudWalker, as well as the state-of-the-art web shell detectionmethods.
Read full abstract