ConRec: malware classification using convolutional recurrence

Abhishek Mallik, Anavi Khetarpal, Sanjay Kumar

doi:10.1007/s11416-022-00416-3

Abstract

Today, the extensive reliance on technology has exposed us to a constant threat of sophisticated malware attacks. Various automated malware production techniques have evolved, some of which reuse specific segments of existing malware to produce new malware, making malware detection challenging. In this paper, we propose a Convolutional Recurrence based malware classification technique that exploits the visual recurrences in the grayscale images of malware samples belonging to the same malware family. First, we convert the malware samples into grayscale images so that a Convolutional Neural Network architecture can capture the structural similarities among them. We then perform data augmentation to counter the effects of high data imbalance and reduce class bias, so that training on the augmented dataset yields a more generalized framework. The augmented dataset is passed through a VGG16-based feature extractor to extract the visual outliers among the malware families. The extracted features are then processed by two stacked BiLSTM layers. The outputs of the BiLSTM layers and the VGG16 layer are merged to perform the final classification of each malware sample into its malware family. The model’s performance is further improved through hyperparameter tuning. We compare the performance of our algorithm against several baseline methods and some contemporary state-of-the-art methods for visual malware detection on two benchmark datasets. The experimental results demonstrate the utility and efficacy of our proposed malware family classification technique.
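The first step of the pipeline, converting a malware sample into a grayscale image, can be illustrated with a minimal sketch. It treats each byte of the sample as one pixel intensity in the range 0 to 255 and lays the bytes out row by row. The function name `bytes_to_grayscale`, the default width of 256 pixels, and the zero-padding of the last row are assumptions made for illustration only; the abstract does not state how the image dimensions are chosen.

```python
from typing import List


def bytes_to_grayscale(data: bytes, width: int = 256) -> List[List[int]]:
    """Reshape raw bytes into rows of `width` pixels (values 0-255).

    Assumption: a final, partial row is zero-padded. The abstract does not
    specify the image width or the padding scheme.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    pixels = list(data)  # each byte is already an intensity in 0..255
    remainder = len(pixels) % width
    if remainder:
        # Pad so that the byte stream fills a whole number of rows.
        pixels.extend([0] * (width - remainder))
    return [pixels[i:i + width] for i in range(0, len(pixels), width)]


# Hypothetical 10-byte "sample" laid out in rows of 4 pixels.
image = bytes_to_grayscale(bytes(range(10)), width=4)
```

Because samples from the same family often share code segments, their byte layouts, and therefore the resulting images, tend to show the recurring visual patterns that the convolutional and recurrent layers are meant to pick up.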

Full Text