Abstract

A drive-by download is an attack in which hackers plant a Web Trojan that exploits browser vulnerabilities to execute malicious software. Because people access web pages with various browsers every day, drive-by downloads have become one of the most common threats in recent years. Most previous studies apply deep learning to the abstract syntax tree (AST) to detect such attacks; these methods achieve high accuracy but are time-consuming and difficult to explain. Other methods rely on dynamic analysis, which requires a specific environment, involves complex operations, and is also time-consuming. To address these problems, this paper proposes DDIML, an explainable machine learning model built on novel features obtained through static analysis. The features are extracted from five aspects: code obfuscation, URL redirection, special behaviors, encoding characters, and CSS attributes. Random forest, a widely used machine learning algorithm, is applied to build the detection classifier. In addition, we use both local and global explanations to improve the model and to show that it can be trusted. Experimental results show that the proposed model detects drive-by downloads efficiently, with a precision of 0.983 and a recall of 0.980. The average total detection time per sample is only 16.07 ms.
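
To make the static-analysis idea concrete, the following is a minimal sketch of how one count-based feature per aspect might be computed from raw page source. It is not the DDIML feature set: the abstract names only the five aspects, so every pattern, the `PATTERNS` table, and the `extract_features` helper are assumptions chosen for illustration.

```python
import re

# Hypothetical indicator patterns, one per feature aspect named in the abstract.
# These are illustrative assumptions, not the features used by DDIML.
PATTERNS = {
    # Code obfuscation: dynamic evaluation and string-decoding calls.
    "obfuscation": re.compile(
        r"\b(?:eval|unescape)\s*\(|\.fromCharCode\s*\(", re.I),
    # URL redirection: script-driven location changes or meta refresh.
    "redirection": re.compile(
        r"(?:window\.|document\.)?location(?:\.href)?\s*="
        r"|http-equiv\s*=\s*[\"']?refresh", re.I),
    # Special behaviors: injected frames, document.write, ActiveX objects.
    "special_behavior": re.compile(
        r"<iframe\b|document\.write\s*\(|new\s+ActiveXObject\s*\(", re.I),
    # Encoding characters: %uXXXX, \xXX, \uXXXX, and %XX escapes.
    "encoded_chars": re.compile(
        r"%u[0-9a-f]{4}|\\x[0-9a-f]{2}|\\u[0-9a-f]{4}|%[0-9a-f]{2}", re.I),
    # CSS attributes: elements hidden or given zero size.
    "css_hiding": re.compile(
        r"display\s*:\s*none|visibility\s*:\s*hidden"
        r"|(?:width|height)\s*[:=]\s*[\"']?0(?![\d.])", re.I),
}


def extract_features(html: str) -> dict:
    """Count matches of each indicator pattern in the page source."""
    return {name: len(pattern.findall(html)) for name, pattern in PATTERNS.items()}


if __name__ == "__main__":
    page = (
        '<iframe src="http://example.test/x" style="display:none" width=0></iframe>'
        '<script>eval(unescape("%u9090%u9090"));'
        'window.location="http://example.test/y";</script>'
    )
    print(extract_features(page))
```

A vector of such counts could then be passed to a random forest implementation (for example scikit-learn's `RandomForestClassifier`, which is an assumption about tooling, not something the abstract states). Because each feature is a named, human-readable count, per-feature importance and per-sample explanations remain straightforward to interpret.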
