Abstract

In the ever-changing cyber threat landscape, evolving malware demands new detection techniques. This paper proposes a strategy for classifying malware programs based on transfer learning. The proposed method, DTMIC (Deep Transfer learning for Malware Image Classification), leverages a deep Convolutional Neural Network (CNN) architecture pre-trained on the ImageNet dataset (more than 10 million images) for malware classification. Windows portable executable (PE) files are converted into grayscale images, on the premise that malware from the same family shows similar characteristics when visualized as an image. The grayscale images serve as input to the customized deep CNN architecture. Features extracted from the convolutional layers of the deep CNN model are flattened and fed into a fully connected dense layer. In addition, to mitigate the overfitting that many CNN models face, a regularization technique called early stopping is employed, which monitors the validation loss with configured parameters. The effectiveness and robustness of the model are evaluated on two benchmark datasets: the MalImg dataset (9339 malware samples from 25 families) and the Microsoft BIG dataset (10868 malware samples from 9 families). DTMIC achieved 98.92% test accuracy on the MalImg dataset and 93.19% on the Microsoft BIG dataset. For comparative analysis, well-established CNN architectures such as VGG16, VGG19, ResNet50, and Google's InceptionV3 are implemented, both as feature extractors and as classifiers. Experimental results show that the proposed DTMIC method outperforms the selected baseline models and is resilient to packed and encrypted malware. Moreover, this study validates the model's efficacy on recent real-world malware samples collected from honeypots in the wild, achieving an accuracy of 96.43%.
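Two steps named in the abstract can be illustrated with a minimal sketch: mapping the raw bytes of a PE file to grayscale pixel rows, and the stopping rule behind early stopping on validation loss. The function names, the fixed image width of 256, the zero padding of the last row, and the `patience` and `min_delta` parameters are all assumptions made for illustration; the abstract does not state the values or the implementation used in DTMIC.

```python
from typing import List, Sequence


def bytes_to_grayscale(data: bytes, width: int = 256) -> List[List[int]]:
    """Map each byte (0-255) to one pixel intensity, row by row.

    The fixed width and the zero padding of the last row are assumptions
    made for this sketch, not parameters taken from the paper.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    rows = []
    for start in range(0, len(data), width):
        row = list(data[start:start + width])
        row.extend([0] * (width - len(row)))  # pad the final partial row
        rows.append(row)
    return rows


def early_stopping_epoch(val_losses: Sequence[float], patience: int = 3,
                         min_delta: float = 0.0) -> int:
    """Return the index of the epoch at which training would stop.

    Training stops once the validation loss has failed to improve by more
    than min_delta for `patience` consecutive epochs. If that never
    happens, the index of the last epoch is returned.
    """
    best = float("inf")
    waited = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best - min_delta:
            best = loss      # new best validation loss, reset the counter
            waited = 0
        else:
            waited += 1      # no improvement in this epoch
            if waited >= patience:
                return epoch
    return len(val_losses) - 1
```

For example, `bytes_to_grayscale(b"MZ\x90\x00\x03", width=4)` yields two rows of four pixels, with the second row zero-padded. In practice the resulting rows would typically be resized to the input resolution expected by the pre-trained CNN before training.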

