JavaScript is a popular scripting language for creating dynamic and interactive web pages. Unfortunately, JavaScript also serves as a basis for web-based attacks that exploit vulnerabilities in web browsers and silently infect users' systems with malicious software. Conventional security tools, such as anti-virus scanners, increasingly fail to fend off this threat because they cannot cope with the rapidly evolving diversity and obfuscation of these attacks. In this article, we present Cujo, a learning-based system for detecting and preventing JavaScript attacks. Embedded in a web proxy, Cujo transparently inspects web pages and blocks the delivery of malicious JavaScript code. It performs a lightweight static and dynamic analysis, which enables it to learn and detect malicious patterns in the structure and behavior of JavaScript code. To operate the system in practice, we introduce an architecture for automatically collecting and sanitizing data for retraining Cujo. We demonstrate the efficacy of this architecture in an empirical evaluation, where Cujo identifies 93% of real attacks with few false alarms, even when attacks are present in the benign web pages used for training the system.