Understanding the Influence of AST-JS for Improving Malicious Webpage Detection

Muhammad Fakhrur Rozi,Tao Ban,Takeshi Takahashi,Sangwook Kim,Seiichi Ozawa,Daisuke Inoue

doi:10.3390/app122412916

Muhammad Fakhrur Rozi, Tao Ban + Show 4 more

Open Access

https://doi.org/10.3390/app122412916

Copy DOI

Abstract

JavaScript-based attacks injected into a webpage to perpetrate malicious activities are still the main problem in web security. Recent works have leveraged advances in artificial intelligence by considering many feature representations to improve the performance of malicious webpage detection. However, they did not focus on extracting the intention of JavaScript content, which is crucial for detecting the maliciousness of a webpage. In this study, we introduce an additional feature extraction process that can capture the intention of the JavaScript content of the webpage. In particular, we developed a framework for obtaining a JavaScript representation based on the abstract syntax tree for JavaScript (AST-JS), which enriches the webpage features for a better detection model. Moreover, we investigated the influence of our proposed feature on improving the model’s performance by using the Shapley additive explanation method to define the significance of each feature category compared to our proposed feature. The evaluation shows that adding the AST-JS feature can improve the performance for detecting malicious webpage compared to previous work. We also found that AST significantly influences performance, especially for webpages with JavaScript content.

Full Text