Abstract

In this paper, we propose a classification system that uses multiple autoencoder models to identify malware images. Malware must be classified accurately before appropriate countermeasures can be deployed to prevent it from spreading, so rapid classification is the first step in preparing an effective response. Typical approaches to this problem, which can be divided into static and dynamic methods, are not suitable for efficient malware classification because they require fixed malware patterns or a large amount of investigation time, respectively. Given enough time and resources, malware analysts can analyze any malware thoroughly. However, resources are finite, and analysts are perpetually short of time because the volume of malware that needs analyzing is increasing dramatically: new malware and variants of existing malware are constantly emerging. To address this issue, many researchers have developed approaches based on machine learning techniques. To date, however, these systems have had difficulty responding to the rapidly changing malware environment, and they also suffer from imbalance in the training data. The system proposed in this paper consists of multiple autoencoder models that classify malware that has been converted to an image. Each autoencoder model classifies only one type of malware and is trained using only samples from the corresponding family. This allows the system to be updated quickly and mitigates the data imbalance problem. Experiments on the Malimg dataset demonstrate that our method outperforms other state-of-the-art techniques.
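The one-model-per-family design described above can be illustrated with a minimal sketch. The text here does not state how the per-family models are combined into a decision or which architecture is used, so two assumptions are made: a sample is assigned to the family whose autoencoder reconstructs it with the lowest error, and a toy linear autoencoder with a single hidden unit stands in for the real network. The names `TinyAutoencoder` and `FamilyBank` and the synthetic 16-pixel "images" are hypothetical.

```python
import random


class TinyAutoencoder:
    """Toy linear autoencoder (one hidden unit, tied weights).

    Assumption: a stand-in for the paper's architecture, which is not
    specified in this text. Inputs are flat lists of pixels in [0, 1].
    """

    def __init__(self, dim, seed=0):
        rng = random.Random(seed)
        self.mean = [0.0] * dim
        self.w = [rng.uniform(-0.1, 0.1) for _ in range(dim)]

    def _center(self, x):
        return [xi - mi for xi, mi in zip(x, self.mean)]

    def reconstruct(self, x):
        c = self._center(x)
        h = sum(wi * ci for wi, ci in zip(self.w, c))  # encode to one value
        return [mi + wi * h for mi, wi in zip(self.mean, self.w)]  # decode

    def error(self, x):
        """Squared reconstruction error of x under this model."""
        return sum((a - b) ** 2 for a, b in zip(x, self.reconstruct(x)))

    def fit(self, samples, epochs=50, lr=0.5):
        n = len(samples)
        self.mean = [sum(col) / n for col in zip(*samples)]
        for _ in range(epochs):
            for x in samples:
                c = self._center(x)
                h = sum(wi * ci for wi, ci in zip(self.w, c))
                r = [ci - wi * h for ci, wi in zip(c, self.w)]  # residual
                rw = sum(ri * wi for ri, wi in zip(r, self.w))
                # Gradient of ||r||^2 w.r.t. w is -2 * (h * r + (r . w) * c).
                self.w = [wi + lr * 2 * (h * ri + rw * ci)
                          for wi, ri, ci in zip(self.w, r, c)]
        return self


class FamilyBank:
    """One autoencoder per malware family, each trained on that family only."""

    def __init__(self):
        self.models = {}

    def add_family(self, name, samples):
        # Adding a family trains one new model; existing models are untouched.
        self.models[name] = TinyAutoencoder(len(samples[0])).fit(samples)

    def classify(self, x):
        # Assumed decision rule: lowest reconstruction error wins.
        return min(self.models, key=lambda name: self.models[name].error(x))


def synthetic_family(pattern, n, rng, noise=0.05):
    """Hypothetical data: noisy copies of a base pixel pattern."""
    return [[p + rng.uniform(-noise, noise) for p in pattern] for _ in range(n)]


rng = random.Random(42)
top = [0.9] * 8 + [0.1] * 8
bottom = [0.1] * 8 + [0.9] * 8
stripes = [0.9, 0.1] * 8

bank = FamilyBank()
bank.add_family("family_a", synthetic_family(top, 20, rng))
bank.add_family("family_b", synthetic_family(bottom, 20, rng))
bank.add_family("family_c", synthetic_family(stripes, 20, rng))  # added later

print(bank.classify(top), bank.classify(bottom), bank.classify(stripes))
```

Because each model sees only its own family, a newly observed family is handled by training one additional model, and a family with few samples does not skew the training of the others. This is one plausible reading of why the design allows quick updates and mitigates data imbalance.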

Highlights

  • Malware is any malicious code or program that can be harmful to computer systems

  • Malware classification is a widely carried out task that can be efficiently accomplished by machine learning models

  • A static method is a signature-based approach that uses a typical pattern as the basis for detection, such as checking how similar the file hash, Shannon entropy, N-gram, or JSON structure is to known malware [1]–[4]


Summary

INTRODUCTION

Malware is any malicious code or program that can be harmful to computer systems. These days, various types of malware are used in attempts to damage information systems, so detecting and preventing malware is essential to protecting them. [...]ware, malware variants, and obfuscated known malware. This approach is time-consuming and suffers from a considerable false-positive rate, while also being subject to platform and environment dependency [5]. Several deep learning models have been proposed to detect and classify malware using autoencoders [11], CNNs [12]–[14], RNNs [15], LSTMs [16], or a combination of these [17]. The proposed system requires no malware expertise to use because it classifies malware solely by its image, not by other criteria such as binary features, functions, or the structure of the malware. It quickly provides the malware family of unknown malicious code and can be updated with emerging malware and variants.
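To make the idea of a "malware image" concrete, the following sketch maps each byte of a binary to one grayscale pixel (0 to 255) and lays the pixels out in fixed-width rows. This is a commonly used convention and is assumed here; the excerpt does not specify the paper's conversion. The row width of 16, the zero padding of the last row, and the function names are illustrative choices.

```python
def bytes_to_gray_rows(data: bytes, width: int = 16) -> list:
    """Lay out raw bytes as rows of grayscale pixels.

    Assumption: one byte is one pixel, and the last row is zero-padded.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    padded = data + bytes((-len(data)) % width)  # pad to a full final row
    return [list(padded[i:i + width]) for i in range(0, len(padded), width)]


def rows_to_pgm(rows: list) -> bytes:
    """Encode the rows as a binary PGM (P5) image for viewing."""
    height = len(rows)
    width = len(rows[0]) if rows else 0
    header = f"P5\n{width} {height}\n255\n".encode("ascii")
    return header + bytes(v for row in rows for v in row)


sample = bytes(range(40))  # hypothetical stand-in for a binary's contents
rows = bytes_to_gray_rows(sample, width=16)
print(len(rows), "rows of", len(rows[0]), "pixels")
pgm = rows_to_pgm(rows)  # could be written to disk with open(path, "wb")
```

Because the conversion uses only raw byte values, it needs no disassembly or knowledge of the binary's structure, which matches the claim that the system classifies by image alone.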

RELATED WORKS
SYSTEM STRUCTURE
VISUALIZATION OF MALWARE
EXPERIMENTAL RESULTS
Findings
CONCLUSION