A Machine Learning Approach to Malicious JavaScript Detection using Fixed Length Vector Representation

Samuel Ndichu, Kouichirou Okada, Takeshi Misu, Seiichi Ozawa

doi:10.1109/ijcnn.2018.8489414

Abstract

JavaScript (JS) is frequently used to add functionality to web applications and to enhance their usability. Despite its many advantages, many recent cyberattacks, such as drive-by-download attacks, exploit vulnerabilities in JS code. Malicious JS code is generally hard to detect because it covertly exploits vulnerabilities in browsers and plugin software and attacks visitors to a website without their knowledge. To protect users from such threats, an accurate detection system for malicious JS is needed. Conventional approaches often employ signature-based and heuristic-based methods, which are vulnerable to zero-day attacks and therefore produce many false negatives and/or false positives. To address this problem, this paper adopts Doc2Vec, a machine-learning approach to feature learning based on a neural network model that can learn the context information of texts. The extracted features are given to a classifier model (e.g., an SVM or a neural network), which judges whether a piece of JS code is malicious. In the performance evaluation, we use the D3M Dataset (Drive-by-Download Data by Marionette) for malicious JS code and JSUPACK for benign code, for both training and testing. We then compare the performance with that of other feature learning methods. Our experimental results show that the proposed Doc2Vec features provide better accuracy and fast classification in malicious JS code detection compared to conventional approaches.
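
As a rough illustration of the pipeline described in the abstract (script text, then a fixed-length vector, then a classifier), the following Python sketch may help. It is not the authors' method: a hashed bag-of-tokens stands in for Doc2Vec, and a 1-nearest-neighbour rule stands in for the SVM or neural network, so that the example runs with the standard library alone. The tokenizer, the vector length `DIM`, the sample scripts, and all function and class names are assumptions made for illustration only.

```python
import math
import re
import zlib

DIM = 32  # fixed vector length; an arbitrary choice for this sketch


def tokenize(js_source):
    """Split JS text into identifiers, digit runs, and single punctuation marks."""
    return re.findall(r"[A-Za-z_$][\w$]*|\d+|[^\s\w]", js_source)


def to_fixed_vector(js_source, dim=DIM):
    """Map a script of any length to a unit-norm vector of length `dim`.

    Stand-in for Doc2Vec: tokens are hashed into `dim` buckets and counted.
    """
    vec = [0.0] * dim
    for tok in tokenize(js_source):
        # crc32 is deterministic across runs, unlike the built-in hash() on str
        vec[zlib.crc32(tok.encode("utf-8")) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def cosine(a, b):
    """Dot product; equals cosine similarity because inputs are unit-norm."""
    return sum(x * y for x, y in zip(a, b))


class NearestNeighbour:
    """Stand-in classifier: returns the label of the most similar training vector."""

    def fit(self, vectors, labels):
        self.vectors, self.labels = list(vectors), list(labels)
        return self

    def predict(self, vec):
        best = max(range(len(self.vectors)),
                   key=lambda i: cosine(vec, self.vectors[i]))
        return self.labels[best]


# Toy, made-up samples (not from D3M or JSUPACK); 0 = benign, 1 = malicious
SAMPLES = [
    'function add(a, b) { return a + b; }',
    'document.getElementById("menu").style.display = "none";',
    'eval(unescape("%75%6e%65"));',
    'document.write(String.fromCharCode(60, 105, 102, 114));',
]
LABELS = [0, 0, 1, 1]

vectors = [to_fixed_vector(s) for s in SAMPLES]
classifier = NearestNeighbour().fit(vectors, LABELS)
```

Unlike Doc2Vec, the hashed counts ignore token order and context. The sketch only shows how scripts of different lengths end up as vectors of one fixed length that a classifier can consume.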

Similar Papers

A machine learning approach to detection of JavaScript-based attacks using AST features and paragraph vectors
Samuel Ndichu ... Kazuo Makishima
Applied Soft Computing | VOL. 84
22 Aug 2019

Detection of Malicious JavaScript Code in Web Pages
Dharmaraj R Patil ... J B Patil
Indian Journal of Science and Technology | VOL. 10
19 May 2017

Detecting Web-Based Attacks with SHAP and Tree Ensemble Machine Learning Methods
Samuel Ndichu ... Takeshi Takahashi
Applied Sciences | VOL. 12
22 Dec 2021

Deobfuscation, unpacking, and decoding of obfuscated malicious JavaScript for machine learning models detection performance improvement
Samuel Ndichu ... Sangwook Kim
CAAI Transactions on Intelligence Technology | VOL. 5
17 Jul 2020
