MALGRA: Machine Learning and N-Gram Malware Feature Extraction and Detection System

Muhammad Ali, Gueltoum Bendiab, Stavros Shiaeles, Bogdan Ghita

doi:10.3390/electronics9111777

Open Access

https://doi.org/10.3390/electronics9111777

Journal: Electronics
Publication Date: Oct 26, 2020
Citations: 39
License type: CC BY 4.0

Affiliation: University of Plymouth, University of Portsmouth

Abstract

Detection and mitigation of modern malware are critical for the normal operation of an organisation. Traditional defence mechanisms are becoming increasingly ineffective because attackers use techniques such as code obfuscation, metamorphism, and polymorphism, which strengthen the resilience of malware. In this context, the development of adaptive, more effective malware detection methods has been identified as an urgent requirement for protecting IT infrastructure against such threats. In this paper, we investigate an alternative method for malware detection that is based on N-grams and machine learning. We use a dynamic analysis technique to extract Indicators of Compromise (IOCs) for malicious files, which are represented using N-grams. The paper also proposes TF-IDF as a novel alternative for identifying the most significant N-gram features for training a machine learning algorithm. Finally, the paper evaluates the proposed technique using various supervised machine-learning algorithms. The results show that Logistic Regression, with a score of 98.4%, provides the best classification accuracy of the classifiers compared.
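
To make the N-gram and TF-IDF step concrete, the following is a minimal sketch using only the Python standard library. It is not the authors' implementation: the behaviour traces, the API-call names in them, and the helper names `ngrams` and `tfidf` are assumptions made for illustration, and the plain unsmoothed TF-IDF formula used here may differ from the variant used in the paper.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return the contiguous n-grams (as tuples) of a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tfidf(docs, n=2):
    """Compute a TF-IDF weight for every n-gram of every document.

    TF  = n-gram count / number of n-grams in the document
    IDF = log(N / df), with N documents and df documents containing the n-gram
    (an assumed, unsmoothed variant chosen for simplicity).
    """
    counts = [Counter(ngrams(doc, n)) for doc in docs]
    # Each n-gram contributes once per document that contains it.
    df = Counter(gram for c in counts for gram in c)
    total_docs = len(docs)
    weights = []
    for c in counts:
        size = sum(c.values()) or 1  # guard against documents shorter than n
        weights.append({
            gram: (k / size) * math.log(total_docs / df[gram])
            for gram, k in c.items()
        })
    return weights


# Hypothetical behaviour traces; real IOCs would come from sandbox reports.
traces = [
    ["CreateFile", "WriteFile", "RegSetValue", "CreateProcess"],
    ["CreateFile", "WriteFile", "CloseHandle"],
    ["CreateFile", "ReadFile", "CloseHandle"],
]

bigram_weights = tfidf(traces, n=2)
unigram_weights = tfidf(traces, n=1)
```

In this sketch an N-gram that occurs in every trace gets a weight of zero, while one that is specific to a single trace is weighted more heavily, which is the property that lets TF-IDF rank N-grams by how distinctive they are.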

Highlights

Malware is a broad term that refers to any piece of software designed intentionally to damage the normal functionality of a computer or a network [1].
We aim to improve the current state of the art in malware analysis by presenting the design and experimental evaluation of a malware detection system, with the following contributions: (a) Malware behavioural modelling using an advanced sandbox: Other studies model the behaviour of malware with traditional sandboxes such as Cuckoo, Norman, and Joe, which our previous research work [36] found to be not very effective at capturing the behaviour of advanced and sophisticated malware. In this work we instead use an AI-based sandbox to perform dynamic analysis and to model the behaviour of the malware.
(c) Optimise Classification: We present the design of a classification system that uses Naive Bayes, Decision Tree, and Random Forest to detect malware using new features.
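
As an illustration of the classification step, here is a small multinomial Naive Bayes classifier over bigram features, written with the standard library only. It is a sketch under stated assumptions, not the system described in the paper: the class `NaiveBayes`, the helper `bigrams`, the training traces, and the use of raw bigram counts with Laplace smoothing are all hypothetical choices made for this example.

```python
import math
from collections import Counter, defaultdict


def bigrams(tokens):
    """Return the adjacent token pairs of a sequence."""
    return list(zip(tokens, tokens[1:]))


class NaiveBayes:
    """Multinomial Naive Bayes over bigram counts with Laplace smoothing."""

    def fit(self, samples, labels):
        self.class_counts = Counter(labels)
        self.feature_counts = defaultdict(Counter)
        self.vocab = set()
        for tokens, label in zip(samples, labels):
            grams = bigrams(tokens)
            self.feature_counts[label].update(grams)
            self.vocab.update(grams)
        return self

    def predict(self, tokens):
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            # Log prior plus the log likelihood of each known bigram.
            score = math.log(count / total)
            denom = sum(self.feature_counts[label].values()) + len(self.vocab)
            for gram in bigrams(tokens):
                if gram in self.vocab:  # bigrams unseen in training are skipped
                    score += math.log((self.feature_counts[label][gram] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical labelled traces, invented for this example.
train = [
    (["CreateFile", "WriteFile", "RegSetValue", "CreateProcess"], "malware"),
    (["RegSetValue", "CreateProcess", "Connect"], "malware"),
    (["CreateFile", "ReadFile", "CloseHandle"], "benign"),
    (["OpenFile", "ReadFile", "CloseHandle"], "benign"),
]
model = NaiveBayes().fit([t for t, _ in train], [l for _, l in train])
```

Calling `model.predict(...)` on a new trace returns the label with the highest posterior score. A realistic setup would train on many more samples and would typically rely on an established machine-learning library rather than a hand-written classifier.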

Summary

Introduction

Malware is a broad term that refers to any piece of software designed intentionally to damage the normal functionality of a computer or a network [1]. Current malware targets victims widely and indiscriminately, from individuals and residential customers to IT systems within large organisations and critical country-wide infrastructures (including nuclear plants and water supply systems), which traditionally have been considered highly secure [2]. Within this spectrum, recent reports [3,4] indicate a significant increase in the production of malware variants that target critical infrastructures. Existing malware variants are continuously evolving, as malware writers improve their detection avoidance mechanisms.

Similar Papers

A novel method for malware detection on ML-based visualization technique
Xinbo Liu ... Jiliang Zhang
Computers & Security | VOL. 89
02 Dec 2019

A Survey of the Recent Trends in Deep Learning Based Malware Detection
Umm-E-Hani Tayyab ... Asifullah Khan
Journal of Cybersecurity and Privacy | VOL. 2
28 Sep 2022

CTIMD: Cyber threat intelligence enhanced malware detection using API call sequences with parameters
Tieming Chen ... Tiantian Zhu
Computers & Security | VOL. 136
01 Oct 2023

Markov Image with Transfer Learning for Malware Detection and Classification
Lok Man Kwan
01 Nov 2022