M2VMapper: Malware-to-Vulnerability mapping for Android using text processing

Shivi Garg, Niyati Baliyan

doi:10.1016/j.eswa.2021.116360

Abstract

Over 90% of mobile malware targets the Android platform. Many Machine Learning (ML) and Deep Learning (DL) techniques have been used to detect and analyze Android malware; however, there is a many-to-many mapping between malware and vulnerabilities. This means that a single piece of malware can exploit multiple security vulnerabilities (known or unknown), and a single vulnerability can be exploited by multiple pieces of malware. It is therefore important to analyze the behaviour of malware in order to identify and reduce vulnerabilities. To date, no ML/DL or other technique has been deployed to analyze malware behaviour for this purpose. This paper proposes a DL framework, 'M2VMapper', that combines transfer learning and pretrained language models to map malware to potential vulnerabilities using a 2D matrix. The many-to-many mapping matrix is obtained using transformer models such as BERT and XLNet, in addition to DL models such as Multi-layer Perceptron (MLP), Recurrent Neural Network (RNN) and Textual Convolutional Neural Network (TextCNN). This malware-to-vulnerability mapping can be leveraged to measure the severity of unknown vulnerabilities and malware during the initial phase of application development. The study is the first of its kind and considers 150 malware families from different datasets, namely AMD, CICInvesAndMal2019, and Androzoo, with a total of 48,907 malware samples and 9 vulnerability types affecting Android. M2VMapper delivers highly promising results, with an accuracy of 99.81% when XLNet is used with TextCNN, and precision and F1-scores above 95% using DL models.
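
To make the notion of a many-to-many mapping matrix concrete, the following minimal sketch builds a binary 2D matrix whose rows are malware families and whose columns are vulnerability types. It is an illustration only, not the M2VMapper pipeline: the labels (`family_A`, `vuln_1`, ...), the example pairs, and the helper function names are all hypothetical, and in the framework the pairs would come from model predictions rather than a hand-written list.

```python
# Hypothetical labels; the actual families and vulnerability types are not listed here.
families = ["family_A", "family_B", "family_C"]
vuln_types = ["vuln_1", "vuln_2", "vuln_3", "vuln_4"]

# Made-up (malware family, vulnerability type) pairs, for illustration only.
pairs = [
    ("family_A", "vuln_1"),
    ("family_A", "vuln_3"),
    ("family_B", "vuln_3"),
    ("family_C", "vuln_2"),
    ("family_C", "vuln_3"),
    ("family_C", "vuln_4"),
]


def build_mapping_matrix(families, vuln_types, pairs):
    """Return a binary matrix: entry [i][j] is 1 if family i maps to vulnerability j."""
    row = {f: i for i, f in enumerate(families)}
    col = {v: j for j, v in enumerate(vuln_types)}
    matrix = [[0] * len(vuln_types) for _ in families]
    for family, vuln in pairs:
        matrix[row[family]][col[vuln]] = 1
    return matrix


def vulns_for_family(matrix, families, vuln_types, family):
    """Read one row: all vulnerability types mapped to a given family."""
    i = families.index(family)
    return [v for v, flag in zip(vuln_types, matrix[i]) if flag]


def families_for_vuln(matrix, families, vuln_types, vuln):
    """Read one column: all families mapped to a given vulnerability type."""
    j = vuln_types.index(vuln)
    return [f for f, r in zip(families, matrix) if r[j]]


matrix = build_mapping_matrix(families, vuln_types, pairs)
```

Reading a row gives the vulnerabilities associated with one family, and reading a column gives the families associated with one vulnerability, which is what makes the relation many-to-many.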

Full Text