Abstract

Malware has surged in both quantity and diversity, posing significant threats to the Internet and to indispensable network applications. Accurate and effective classification plays a pivotal role in defending against malware. Numerous approaches employ supervised learning techniques, particularly Convolutional Neural Networks (CNNs), to train feature extractors. However, acquiring a large quantity of labeled samples is expensive, and relying solely on CNNs as feature extractors restricts the model to local receptive fields, which can cause crucial features to be lost. To address these limitations, we propose an effective malware classification approach, denoted MalSort, which leverages a masked self-supervised framework built on the Swin Transformer. First, each malware sample is transformed into a color image. Next, the Swin Transformer self-supervised framework extracts multi-scale key feature vectors from the randomly masked color image, while a prediction module predicts the masked portions of the image. Finally, the pre-trained encoder is fine-tuned on the malware dataset to carry out the malware classification task. MalSort relies less on labeled samples during training, obviating the need for large amounts of labeled data. Consequently, MalSort conserves hardware resources and improves training efficiency. Experimental results indicate that MalSort outperforms existing models, achieving a classification accuracy of 97.85%, a recall of 97.63%, a precision of 97.85%, and an F1-score of 97.85% on the BIG2015 dataset. Similarly, on the Malimg dataset, the model achieves 98.28%, 98.18%, 98.19%, and 98.28% for classification accuracy, recall, precision, and F1-score, respectively.
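
The first two stages described above (converting a sample into a color image, then randomly masking part of it) can be illustrated with a minimal sketch. This is not the MalSort implementation: the abstract does not specify the byte-to-pixel mapping, image width, patch size, or masking ratio, so the three-bytes-per-pixel mapping and the parameters `width`, `patch`, and `mask_ratio` below are assumptions chosen only for illustration, and all function names are hypothetical.

```python
import math
import random


def bytes_to_rgb_image(data: bytes, width: int = 64):
    """Map raw bytes to rows of (R, G, B) pixels.

    Assumption: three consecutive bytes form one pixel, and the tail is
    zero-padded so that the last row is complete.
    """
    row_bytes = 3 * width
    padded_len = max(1, math.ceil(len(data) / row_bytes)) * row_bytes
    padded = data.ljust(padded_len, b"\x00")
    pixels = [tuple(padded[i:i + 3]) for i in range(0, len(padded), 3)]
    return [pixels[r:r + width] for r in range(0, len(pixels), width)]


def random_patch_mask(n_rows: int, n_cols: int, patch: int = 4,
                      mask_ratio: float = 0.6, seed: int = 0):
    """Pick a random subset of patch coordinates to hide.

    Assumption: square, non-overlapping patches and a fixed masking ratio.
    """
    grid = [(r, c)
            for r in range(n_rows // patch)
            for c in range(n_cols // patch)]
    rng = random.Random(seed)
    return set(rng.sample(grid, int(len(grid) * mask_ratio)))


def apply_mask(image, masked, patch: int = 4):
    """Replace every pixel that falls inside a masked patch with black."""
    return [
        [(0, 0, 0) if (r // patch, c // patch) in masked else px
         for c, px in enumerate(row)]
        for r, row in enumerate(image)
    ]


# Stand-in for the raw bytes of a malware sample (hypothetical data).
sample = bytes(range(256)) * 48
image = bytes_to_rgb_image(sample, width=64)
masked = random_patch_mask(len(image), len(image[0]))
masked_image = apply_mask(image, masked)
```

In a masked self-supervised setup, an encoder would see `masked_image` and a prediction module would be trained to recover the hidden patches of `image`; the encoder and predictor themselves are outside the scope of this sketch.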

