Fuzzy–synthetic minority oversampling technique: Oversampling based on fuzzy set theory for Android malware detection in imbalanced datasets

Yanping Xu, Kangfeng Zheng, Xinxin Niu, Chunhua Wu, Yixian Yang

doi:10.1177/1550147717703116

Abstract

In previous work, imbalanced datasets composed of more benign samples (the majority class) than malicious samples (the minority class) have been widely adopted in Android malware detection. These imbalanced datasets bias learning toward the majority class, so that minority class examples are more likely to be misclassified. To solve this problem, we propose a new oversampling method called fuzzy–synthetic minority oversampling technique, which is based on fuzzy set theory and the synthetic minority oversampling technique. As the sample size of the majority class increases relative to that of the minority class, fuzzy–synthetic minority oversampling technique generates more synthetic examples for each minority class example in the fuzzy region, where the minority examples have a low degree of membership to the minority class and are more likely to be misclassified. Using the new synthetic examples, the classifiers build larger decision regions that contain more minority examples, and they are no longer biased toward the majority class. Compared with the synthetic minority oversampling technique and Borderline–synthetic minority oversampling technique methods, fuzzy–synthetic minority oversampling technique achieves higher accuracy on both the minority class and the entire datasets.
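The idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the membership function used here (the fraction of a minority example's k nearest neighbours that are also minority examples), the function name `fuzzy_smote_sketch`, and the parameters `k`, `base`, and `seed` are all assumptions made for illustration. The sketch only shows the general pattern of generating more SMOTE-style interpolated examples for minority points with low membership, scaled by the imbalance ratio.

```python
import numpy as np


def fuzzy_smote_sketch(X_min, X_maj, k=5, base=1.0, seed=None):
    """Illustrative membership-weighted SMOTE (NOT the paper's exact method).

    Assumption: the membership of a minority point is the fraction of its
    k nearest neighbours (over all samples) that are minority points.
    """
    X_min = np.asarray(X_min, dtype=float)
    X_maj = np.asarray(X_maj, dtype=float)
    if len(X_min) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = np.random.default_rng(seed)

    X_all = np.vstack([X_min, X_maj])
    is_min = np.arange(len(X_all)) < len(X_min)  # minority rows come first
    ratio = len(X_maj) / len(X_min)              # imbalance ratio
    k_all = min(k, len(X_all) - 1)
    k_min = min(k, len(X_min) - 1)

    synthetic = []
    for i, x in enumerate(X_min):
        # Membership: share of minority points among the nearest neighbours.
        d_all = np.linalg.norm(X_all - x, axis=1)
        d_all[i] = np.inf                        # ignore the point itself
        membership = is_min[np.argsort(d_all)[:k_all]].mean()

        # Lower membership and higher imbalance give more synthetic points.
        n_new = int(round(base * ratio * (1.0 - membership)))

        # Standard SMOTE step: interpolate toward a random minority neighbour.
        d_min = np.linalg.norm(X_min - x, axis=1)
        d_min[i] = np.inf
        neighbours = np.argsort(d_min)[:k_min]
        for _ in range(n_new):
            j = rng.choice(neighbours)
            synthetic.append(x + rng.random() * (X_min[j] - x))

    return np.array(synthetic, dtype=float).reshape(-1, X_min.shape[1])
```

Calling `fuzzy_smote_sketch(X_min, X_maj, k=5, seed=0)` returns an array of synthetic minority examples that could be appended to the training set before fitting a classifier. Under the assumed membership function, minority points surrounded only by other minority points receive no synthetic examples, while points surrounded by majority examples receive the most.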

Highlights

Malware continues to increase with the rapid development of mobile networks, especially on the Android platform and services
Fuzzy set theory provides a methodology for data analysis; here, we extend fuzzy set theory to the task of Android malware detection in imbalanced datasets
In the experiments, we showed that combining fuzzySMOTE with support vector machines (SVMs) performs better than combining SVMs with other oversampling methods

Summary

Introduction

Malware (malicious software) continues to increase with the rapid development of mobile networks, especially on the Android platform and services. Android malware can infect and harm Android platforms and services through various methods such as malicious websites, spam, malicious SMS messages, and malware-bearing advertisements. There are two effective types of approaches to detecting Android malware: static analysis by decompiling the source code and dynamic analysis by monitoring application execution at runtime.[3] The datasets for most Android malware detection experiments are composed of both benign and malicious applications. Because it is difficult to create a comprehensive malware collection, the datasets used in the experiments are typically imbalanced: the number of benign applications is larger than the number of malware applications.[4,5,6] This imbalance problem is usually not considered seriously.

Journal: International Journal of Distributed Sensor Networks	Publication Date: Apr 1, 2017
Citations: 14	License type: cc-by

Similar Papers

CLASSIFICATION BOOSTING IN IMBALANCED DATA
Sinta Septi Pangastuti ... Wahyuni Suryaningtyas
Malaysian Journal of Science | VOL. 38
30 Sep 2019

PF-SMOTE: A novel parameter-free SMOTE for imbalanced datasets
Qiong Chen ... Xing-Gang Luo
Neurocomputing | VOL. 498
11 May 2022

SMOTE: Synthetic Minority Over-sampling Technique
N V Chawla ... K W Bowyer
Journal of Artificial Intelligence Research | VOL. 16
01 Jun 2002

Borderline-SMOTE: A New Over-Sampling Method in Imbalanced Data Sets Learning
Hui Han ... Wen-Yuan Wang
01 Jan 2004
