A Survey of Malware Detection Techniques based on Machine Learning

Hoda El Merabet, Abderrahmane Hajraoui

doi:10.14569/ijacsa.2019.0100148

Open Access

https://doi.org/10.14569/ijacsa.2019.0100148

Abstract

Diverse malware programs are developed daily to attack computer systems without the knowledge of their users. While some authors of these programs intend to steal secret information, others quietly try to prove their competence and aptitude. Anti-malware programs primarily rely on the traditional signature-based static technique to counter this malicious code. Although this technique excels at blocking known malware, it can never intercept new malware. A number of anti-malware programs also use the dynamic technique, which is often based on running the executable in a virtual environment. The major drawbacks of this technique are long scanning times and high resource consumption. More recent programs may use a third technique: the heuristic technique based on machine learning, which has proven successful in several areas that involve processing huge amounts of data. In this paper we survey the available research that uses this latter technique to counter cyber-attacks. We explore the different training phases of machine learning classifiers for malware detection. The first phase is the extraction of features from the input files according to previously chosen feature types. The second phase is the rejection of the less important features and the selection of the most important ones, which better represent the data contained in the input files. The last phase is feeding the selected features into a chosen machine learning classifier, so that it can learn to distinguish between benign and malicious files and give accurate predictions when confronted with previously unseen files. The paper ends with a critical comparison of the studied approaches according to their malware detection performance.
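The three training phases described in the abstract can be illustrated with a minimal, self-contained sketch. It is not the method of any surveyed paper. It assumes byte 2-grams as the feature type, information gain as the selection criterion, and a small Bernoulli naive Bayes classifier as the learner. All function names and the toy byte strings are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def extract_ngrams(data: bytes, n: int = 2) -> set:
    """Phase 1 (extraction): the set of byte n-grams present in a file."""
    return {data[i:i + n] for i in range(len(data) - n + 1)}


def entropy(labels) -> float:
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(feature, samples, labels) -> float:
    """Reduction in label entropy from knowing whether `feature` is present."""
    present = [y for x, y in zip(samples, labels) if feature in x]
    absent = [y for x, y in zip(samples, labels) if feature not in x]
    n = len(labels)
    conditional = (len(present) / n) * entropy(present) \
        + (len(absent) / n) * entropy(absent)
    return entropy(labels) - conditional


def select_features(samples, labels, k: int) -> list:
    """Phase 2 (selection): keep the k features with the highest gain."""
    vocabulary = set().union(*samples)
    ranked = sorted(
        vocabulary,
        key=lambda f: (-information_gain(f, samples, labels), f),
    )
    return ranked[:k]


def train_naive_bayes(samples, labels, features) -> dict:
    """Phase 3 (training): Bernoulli naive Bayes with add-one smoothing."""
    model = {}
    for cls in sorted(set(labels)):
        rows = [x for x, y in zip(samples, labels) if y == cls]
        prior = len(rows) / len(samples)
        probs = {
            f: (sum(f in x for x in rows) + 1) / (len(rows) + 2)
            for f in features
        }
        model[cls] = (prior, probs)
    return model


def predict(model: dict, sample: set) -> str:
    """Return the class with the highest log-posterior for one sample."""
    def log_score(prior, probs):
        score = math.log(prior)
        for f, p in probs.items():
            score += math.log(p if f in sample else 1.0 - p)
        return score
    return max(model, key=lambda cls: log_score(*model[cls]))


# Toy data, made up for illustration only (not real malware or real files).
benign_files = [b"hello world", b"hello there", b"world peace"]
malicious_files = [b"\x90\x90\xcc evil", b"\x90\x90 payload", b"xx\x90\x90yy"]

samples = [extract_ngrams(d) for d in benign_files + malicious_files]
labels = ["benign"] * len(benign_files) + ["malicious"] * len(malicious_files)

selected = select_features(samples, labels, k=3)
model = train_naive_bayes(samples, labels, selected)
```

Calling `predict(model, extract_ngrams(some_bytes))` then classifies a previously unseen byte string. A real detector would use far larger datasets and one of the feature types listed in the outline below, such as DLL function calls or PE header fields.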

Highlights

Every day, the AV-TEST institute registers over 250,000 new malware programs
We focus on research aimed at detecting malware designed for Windows operating systems
Transforming the input dataset into another exploitable space brings a great gain in both data processing time and performance measures

Summary

Anti-malware programs primarily rely on the traditional signature-based static technique to counter malicious code. Although this technique excels at blocking known malware, it can never intercept new malware. More recent programs may use a third technique: the heuristic technique based on machine learning, which has proven successful in several areas that involve processing huge amounts of data. We explore the different training phases of machine learning classifiers for malware detection. The last phase is feeding the selected features into a chosen machine learning classifier, so that it can learn to distinguish between benign and malicious files and give accurate predictions when confronted with previously unseen files.

INTRODUCTION

FEATURE EXTRACTION

Signatures Extraction

DLL Function Calls Extraction

Binary Sequences Extraction

PE File Header Fields Extraction

Entropy Signals Extraction

FEATURE SELECTION TECHNIQUES

Information Gain

Random Forest

Calculation of Accuracy by Considering Each Attribute Separately

Wavelet Transform

MACHINE LEARNING CLASSIFIERS

Neural Network

CLASSIFICATION OF THE STUDIED RESEARCHES

Findings

CONCLUSIONS

Journal: International Journal of Advanced Computer Science and Applications
Publication Date: Jan 1, 2019
Citations: 33
License type: cc-by

Similar Papers

Ensemble learning for effective run-time hardware-based malware detection
Hossein Sayadi ... Nisarg Patel
24 Jun 2018

Comprehensive assessment of run-time hardware-supported malware detection using general and ensemble learning
Hossein Sayadi ... Setareh Rafatirad
08 May 2018

StealthMiner: Specialized Time Series Machine Learning for Run-Time Stealthy Malware Detection based on Microarchitectural Features
Hossein Sayadi ... Tinoosh Mohsenin
07 Sep 2020

Malware Classification Approaches Using Machine Learning Techniques: A Review
Shivarti Naik ... Amita Dessai
10 Dec 2021
