Abstract

Among the millions of Android applications currently available, numerous malicious programs pose significant threats to users' security and privacy, so approaches for detecting Android malware are essential. Recent malware detection methods typically rely on features such as application programming interface (API) sequences, images, and permissions, and they overlook source code and its associated comments, the latter of which are usually absent from malware. We therefore propose Android-SEM, an Android source code semantic enhancement model based on transfer learning. The model builds on the Transformer architecture to form a pretraining framework that generates code comments from malware source code, and the performance of this framework is optimized using a generative adversarial network. A novel regression-model-based filter retains high-quality comments and source code, which are then fused into features for semantic enhancement. In contrast to conventional methods, we classify malicious Android code with a quantum support vector machine (QSVM), combining quantum machine learning with classical deep learning models. The results show that Android-SEM achieves accuracies of 99.55% and 99.01% for malware detection and malware categorization, respectively.

Highlights

  • The number of Android applications has been increasing rapidly owing to the significantly increased use of Android-based mobile devices

  • Haq et al. [44] generated images from entire APK files and trained a transfer-learning-based CNN model on them, obtaining an accuracy of 96.4%

  • Building on the concept of transfer learning, this study focuses on obtaining, through transfer, comment information that is not originally present in Android malware


Summary

Introduction

The number of Android applications has been increasing rapidly owing to the significantly increased use of Android-based mobile devices. To date, no study has used pretrained models to transfer comment information from Android source code to the malicious-code domain as program features for semantic enhancement. In this study, we propose Android-SEM, an Android source code semantic enhancement model based on transfer learning. To the best of our knowledge, we are the first to use feature fusion vectors obtained from comments and source code to enhance the semantic information contained in malware, and the first to develop a model that combines QSVM with classical deep learning to detect malicious code in Android-based applications. Android-SEM is more accurate than other detection models proposed in recent studies.
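The idea of fusing code and comment features and then classifying them with a quantum kernel can be illustrated with a small sketch. It is not the Android-SEM implementation. It assumes that fusion is plain concatenation, that each fused feature is angle-encoded as one RY rotation on its own qubit, and that the kernel is the state fidelity, which for this product-state encoding has the closed form of a product of cos² terms and can therefore be computed classically. The function names (`fuse`, `angle_kernel`, `gram_matrix`) and the toy vectors are hypothetical.

```python
import math


def fuse(code_vec, comment_vec):
    """Fuse two feature vectors by concatenation (an assumed fusion operator)."""
    return list(code_vec) + list(comment_vec)


def angle_kernel(x, y):
    """Fidelity |<phi(x)|phi(y)>|^2 for angle encoding.

    Each feature x_i prepares RY(x_i)|0> on its own qubit, so the overlap
    of two encoded states factorizes into cos((x_i - y_i) / 2) per qubit.
    """
    if len(x) != len(y):
        raise ValueError("feature vectors must have equal length")
    k = 1.0
    for xi, yi in zip(x, y):
        k *= math.cos((xi - yi) / 2.0) ** 2
    return k


def gram_matrix(samples):
    """Pairwise kernel values, the input a kernel SVM would train on."""
    return [[angle_kernel(a, b) for b in samples] for a in samples]


# Toy, made-up feature vectors: two similar samples and one dissimilar one.
code_feats = [[0.1, 0.9], [0.2, 0.8], [2.5, 0.1]]
comment_feats = [[0.3], [0.4], [1.9]]

fused = [fuse(c, m) for c, m in zip(code_feats, comment_feats)]
K = gram_matrix(fused)

for row in K:
    print(["%.3f" % v for v in row])
```

A Gram matrix of this kind could be handed to a classical kernel SVM that accepts precomputed kernels, for example scikit-learn's `SVC(kernel="precomputed")`. On quantum hardware the same kernel entries would be estimated from circuit measurements rather than computed in closed form.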

Android Malware Detection and Classification
Source Code Comment Generation
Quantum Machine Learning
Method
Feature Engineering
Model Design
Quantum Classifier
Datasets
Experimental Setup
Implementation
Model Classification Capabilities
Model Comparison Experiment
Quantum Classifiers with Different Encoding Methods
Classification Score
Conclusions and Future Work