A Graph-Based Feature Generation Approach in Android Malware Detection with Machine Learning Techniques

Xiaojian Liu,Kehong Liu,Qian Lei

doi:10.1155/2020/3842094

Abstract

The explosive spread of Android malware is a serious concern for Android application security. One way to detect malicious payloads hidden in an application is to treat detection as a binary classification problem, which can be tackled effectively with traditional machine learning techniques. The key factors in detecting Android malware with machine learning techniques are feature selection and generation. Most existing approaches select and generate features without fully examining the structures of programs, so the important semantic information associated with these features is lost, resulting in a low detection accuracy. To address this issue, we propose a new feature generation approach for Android applications, which takes components and program structures into consideration and extracts features in a graph-based and semantics-rich style. The approach has two distinguishing aspects: context-based feature selection and graph-based feature generation. We abstract an Android application as a collection of reduced iCFGs (interprocedural control flow graphs) and extract original features from these graphs. By combining the original features with their contexts, we generate new features that hold richer semantic information than the original ones. By embedding the features into a feature vector space, we can use machine learning techniques to train a malware detector. The experimental results show that this approach achieves an accuracy rate of 95.4% and a recall rate of 96.5%, which demonstrates the effectiveness and advantages of our approach.
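The pipeline outlined in the abstract (graph, raw features, features paired with context, vector embedding) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the toy graph, the node names, the choice of the component entry point as the "context", and the helper names `reachable_features`, `contextual_features`, and `embed` are all assumptions made for the example.

```python
from collections import deque

# Toy "reduced iCFG" for one component: node -> successor nodes.
# All node names below are hypothetical and chosen only for illustration.
GRAPH = {
    "SmsReceiver.onReceive": ["getDeviceId", "sendTextMessage"],
    "getDeviceId": ["openConnection"],
    "sendTextMessage": [],
    "openConnection": [],
}
ENTRY = "SmsReceiver.onReceive"
# Nodes treated as raw features (assumed here to be sensitive API calls).
RAW_FEATURES = {"getDeviceId", "sendTextMessage", "openConnection"}


def reachable_features(graph, entry, raw_features):
    """Breadth-first walk from the entry; return raw features in visit order."""
    seen, order, queue = {entry}, [], deque([entry])
    while queue:
        node = queue.popleft()
        if node in raw_features:
            order.append(node)
        for succ in graph.get(node, []):
            if succ not in seen:
                seen.add(succ)
                queue.append(succ)
    return order


def contextual_features(graph, entry, raw_features):
    """Pair each reachable raw feature with its context (assumed: the entry point)."""
    return {(entry, f) for f in reachable_features(graph, entry, raw_features)}


def embed(features, vocabulary):
    """Binary embedding: 1 if the vocabulary feature occurs in the app, else 0."""
    return [1 if v in features else 0 for v in vocabulary]


# Hypothetical feature vocabulary shared by all applications in a dataset.
VOCABULARY = [
    ("SmsReceiver.onReceive", "getDeviceId"),
    ("SmsReceiver.onReceive", "sendTextMessage"),
    ("MainActivity.onCreate", "getDeviceId"),
]
app_features = contextual_features(GRAPH, ENTRY, RAW_FEATURES)
vector = embed(app_features, VOCABULARY)
```

The point of the pairing is that the same raw feature reached from different components yields different generated features, so the resulting vector carries some of the program structure that a flat list of API calls would lose.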

Highlights

The Android system, as one of the most popular mobile platforms, faces serious security challenges due to its open-source nature, imperfect permission mechanisms, and the absence of full certification of applications at publication.
Machine learning techniques have been widely used in malware detection [1,2,3,4,5,6,7,8]. This kind of approach treats malware detection as a binary classification problem, which can be tackled with traditional techniques from pattern recognition or machine learning.
We combine the raw features with their contexts to form new features, which hold richer semantic information than individual raw features; classifying an application with this feature set achieves better detection performance.

Summary

Introduction

As one of the most popular mobile platforms, Android faces serious security challenges due to its open-source nature, imperfect permission mechanisms, and the absence of full certification of applications at publication. Machine learning approaches treat malware detection as a binary classification problem (i.e., classifying an application as malicious or benign), which can be tackled with traditional techniques from pattern recognition or machine learning. Since such an approach does not fully investigate the semantics and all details of programs, it achieves better performance than traditional dynamic approaches [1, 2, 9] and static approaches [4,5,6, 10,11,12,13,14,15] in terms of scalability and time consumption. (1) We propose a context-based feature selection approach, which combines the three kinds of raw features with their contexts to serve as newly generated features. Since these features hold rich semantic information about program behaviors, we achieve a better result than traditional machine learning-based approaches.
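To make the binary classification framing concrete, the sketch below trains a tiny linear detector on binary feature vectors and computes the two metrics the abstract reports, accuracy and recall. It is only an illustration under stated assumptions: this summary does not say which classifier the authors use, so a plain perceptron stands in for it, and the toy data and the names `train_perceptron`, `predict`, `accuracy`, and `recall` are hypothetical.

```python
def predict(weights, bias, x):
    """Label 1 (malicious) if the linear score is positive, else 0 (benign)."""
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if score > 0 else 0


def train_perceptron(samples, labels, epochs=10):
    """Classic perceptron updates over binary feature vectors."""
    weights, bias = [0] * len(samples[0]), 0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            error = y - predict(weights, bias, x)  # -1, 0 or +1
            if error:
                weights = [w + error * xi for w, xi in zip(weights, x)]
                bias += error
    return weights, bias


def accuracy(y_true, y_pred):
    """Fraction of all applications that were labelled correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred):
    """Fraction of truly malicious applications that were detected."""
    true_pos = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    actual_pos = sum(1 for t in y_true if t == 1)
    return true_pos / actual_pos


# Hypothetical toy data: each row is a binary feature vector for one app.
X = [[1, 1, 0], [1, 0, 1], [0, 1, 0], [0, 0, 1]]
y = [1, 1, 0, 0]  # 1 = malicious, 0 = benign

weights, bias = train_perceptron(X, y)
predictions = [predict(weights, bias, x) for x in X]
```

Here the detector is scored on its own training data purely to keep the example short; a real evaluation would use held-out applications. Recall is the metric to watch in this setting because a missed malicious application is usually costlier than a false alarm.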

Feature Selection

Feature Generation

Feature Transformation

Feature Function for Bigrams

Related Works

Conclusion

Journal: Mathematical Problems in Engineering
Publication Date: May 27, 2020
Citations: 4
License type: CC BY 4.0

Similar Papers

Android malware detection: state of the art
Sunil Kumar Muttoo ... Shikha Badhani
International Journal of Information Technology | VOL. 9
22 Feb 2017

The Concept of Attack Scenarios and Its Applications in Android Malware Detection
Yu-Chen Chang ... Sheng-De Wang
01 Dec 2016

Android Malware Detection through Machine Learning Techniques: A Review
Abikoye Oluwakemi Christiana ... Akande Noah
International Journal of Online and Biomedical Engineering (iJOE) | VOL. 16
12 Feb 2020

Mass Discovery of Android Malware Behavioral Characteristics for Detection Consideration
Xin Su ... Jiuchuan Lin
01 Jan 2018
