Malware analysis is a critical aspect of cybersecurity, aiming to identify and differentiate malicious software from benign programmes to protect computer systems from security threats. Despite advancements in cybersecurity measures, malware continues to pose significant risks in cyberspace, necessitating accurate and rapid analysis methods. This paper introduces an innovative approach to malware classification using image analysis, involving three key phases: converting operation codes into RGB image data, employing a Generative Adversarial Network (GAN) for synthetic oversampling, and utilising a simplified Vision Transformer (ViT)-based classifier for image analysis. The method enhances feature richness and explainability through visual imagery data and addresses imbalanced classification using GAN-based oversampling techniques. The proposed framework combines the strengths of convolutional autoencoders, hybrid classifiers, and adapted ViT models to achieve a balance between accuracy and computational efficiency. As shown in the experiments, our convolutional-free approach possesses excellent accuracy and precision compared with convolutional models and outperforms CNN models on two datasets, thanks to the multi-head attention mechanism. On the Big2015 dataset, our model outperforms other CNN models with an accuracy of 0.8369 and an AUC of 0.9791. Specifically, our model reaches an accuracy of 0.9697 and an F1 score of 0.9702 on MALIMG, which is extraordinary.
Read full abstract