SDIF-CNN: Stacking deep image features using fine-tuned convolution neural network models for real-world malware detection and classification

Sanjeev Kumar,Kajal Panda

doi:10.1016/j.asoc.2023.110676

Abstract

The detection of malware is a complex problem in the area of Internet security. Developing a malware defense system that is less costly to detect large-scale malware is needed. This paper proposes a novel malware detection and classification architecture based on image visualization as SDIF-CNN: Stacking deep image features using fine-tuned convolution neural networks. The hybrid methodology of transfer learning as fine-tuning and feature extractor of deep convolution neural network models is designed. At first, the pre-trained VGG16 CNN model is deeply fine-tuned with different hyperparameters, including the number of layers, learning rate, momentum, etc. The transfer learning-based fine-tuned VGG16 model is used as a feature extractor along with the three similar pre-trained CNN models, VGG19, ResNet50, and InceptionV3, to obtain the diverse feature map. The extracted features are horizontally concatenated to construct a single feature map. The different feature selection methodologies, including filter-based methods and embedded methods, such as linear regression and random forest, are designed to discard the irrelevant features from a stacked feature map. After that, this study uses six machine learning and deep learning classifiers- K-Nearest Neighbor (K-NN), Support Vector Machine (SVM), Random Forest (RF), Multi-Layer Perceptron (MLP), Extra Tree (ET), and Gaussian Naive Bayes (GNB) by using the stacked feature map as a training feature vector. The hyperparameter optimization of the MLP model as the best classifier is performed using a randomized search algorithm to devise an optimal classifier. The experiments are performed using a publicly benchmarked MalImg dataset of 9339 images from 25 families. The model is also validated on real-world and packed malicious programs to prove the generalization of the proposed methodology in detecting real-world malware. In the proposed system, the MLP model obtained the best performance results as 98.55% accuracy, 99% precision, 99% recall, and 99% F1-score for MalImg datasets, and accuracy of 94.78% for real-world malware datasets. The proposed methodology is resilient to commonly used obfuscation techniques and does not depend upon code disassembly, reverse engineering analysis, and highly resource-intensive dynamic analysis.

Full Text