Abstract

In recent years, malware and their variants have proliferated, which poses a grave threat to the systems and networks’ security, so it is urgent to detect and classify malware in time to prevent the spread of malicious activities. However, the existing malware detection and classification methods can't meet the requirement of the application perfectly. Among them, machine learning-based approaches generally face the dilemma of balancing efficiency and accuracy due to imperfect feature representation, while deep learning-based methods are usually computationally intense to train and deploy. In order to solve the problem, we focus on improving the feature extraction and classification model, and propose a Byte and Hex n-gram based Malware Detection and Classification method called BHMDC in this paper. For malware detection, LightGBM is used to detect malware with just 256-dimensional byte unigram features, which achieves an accuracy of more than 99.70% on two built datasets with less time consumption. For malware classification, block byte unigram and hex n-gram are proposed and combined together as the feature, which can preserve more properties and profile executable files in a multi-granular way, then random forest is used to optimize the feature by removing redundant information and reducing the dimensionality, and LightGBM is finally utilized to identify malware families. The performance of the proposed approach is evaluated through experiments, and it is compared with state-of-the-art methods. The proposed approach produces 99.264% accuracy on Microsoft malware classification challenge dataset and 99.775% accuracy on Malimg dataset respectively, which substantially outperforms the other approaches. Promising experimental results reveal that BHMDC can be used in antivirus software to detect malware variants and help security analysts to identify malware families.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call