Abstract

Malicious software (“malware”) has become one of the most serious cybersecurity issues in the Android ecosystem. Given the fast evolution of Android malware releases, it is not practically feasible to detect malware apps in the Android ecosystem manually. As a result, machine learning has become an emerging approach for malware detection. Since machine learning performance is largely influenced by the availability of high-quality and relevant features, feature selection approaches play a key role in machine-learning-based detection of malware. In this paper, we formulate the feature selection problem as a quadratic programming problem and analyse how commonly used filter-based feature selection methods work, with an emphasis on Android malware detection. We compare and contrast several feature selection methods along several factors, including the composition of the relevant features selected. We empirically evaluate the feature subset selection algorithms and compare their predictive accuracy and execution time using several learning algorithms. The results of the experiments confirm that feature selection is necessary for improving the accuracy of the learning models as well as decreasing the run time. The results also show that the performance of the feature selection algorithms varies from one learning algorithm to another and that no single feature selection approach outperforms the others all the time.

Highlights

  • The Internet of Things (IoT) has come to permeate all aspects of our lives, and IoT devices such as smartphones and smartwatches have become necessary in the modern mobile-centric connected world

  • Although Google tries to weed out malware-infected apps from its market, the Google Play Store occasionally hosts malicious apps, estimated to be up to 22% of the apps uploaded to the store [6]

  • Feature selection algorithms select a subset of features from the original feature set, which are considered useful for training the learning models to obtain good results [2,10]


Summary

Introduction

The Internet of Things (IoT) has come to permeate all aspects of our lives, and IoT devices such as smartphones and smartwatches have become necessary in the modern mobile-centric connected world. Feature selection algorithms select a subset of features from the original feature set that are considered useful for training the learning models to obtain good results [2,10]. Research on the usefulness of state-of-the-art feature subset selection methods in the context of Android malware detection models has not received the attention it deserves [10]. To this end, we investigate the utility of commonly used feature subset selection approaches for malware detection on Android platforms. We formulate the feature selection problem as a quadratic programming problem, and analyse how different feature selection methods work and how they are used in Android malware detection models.
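As a rough illustration of how a filter-based method scores and selects features, the following minimal Python sketch ranks binary features by information gain, one of the filter criteria the paper examines. The toy permission matrix, the labels, and the helper names (`entropy`, `information_gain`, `select_top_k`) are assumptions made for illustration only; they are not taken from the paper's dataset or implementation.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def information_gain(feature_column, labels):
    """Reduction in label entropy after splitting on a discrete feature."""
    total = len(labels)
    conditional = 0.0
    for value in set(feature_column):
        subset = [y for x, y in zip(feature_column, labels) if x == value]
        conditional += (len(subset) / total) * entropy(subset)
    return entropy(labels) - conditional


def select_top_k(samples, labels, feature_names, k):
    """Rank features by information gain and keep the k highest-scoring ones."""
    scores = {}
    for j, name in enumerate(feature_names):
        column = [row[j] for row in samples]
        scores[name] = information_gain(column, labels)
    ranked = sorted(feature_names, key=lambda n: scores[n], reverse=True)
    return ranked[:k], scores


# Hypothetical toy data: rows are apps, columns are binary permission flags.
FEATURES = ["SEND_SMS", "INTERNET", "READ_CONTACTS"]
SAMPLES = [
    [1, 1, 1],
    [1, 1, 0],
    [0, 1, 1],
    [0, 1, 0],
]
LABELS = [1, 1, 0, 0]  # 1 = malware, 0 = benign (made-up labels)

selected, scores = select_top_k(SAMPLES, LABELS, FEATURES, k=1)
print(selected, scores)
```

In this toy setting, a feature requested by every app (here `INTERNET`) carries no information about the class, while a feature that splits the classes cleanly receives the highest score. A filter method of this kind scores each feature independently of the learning algorithm, which is what makes it cheap to run before training.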

Problem Overview
Related Work
Classifier
Feature Vectors
Feature Subset Selection Methods
Pearson Correlation Coefficient
Chi-Square
Information Gain
Mutual Information
Experimental Setup and the Dataset
Performance Metrics
Malware Classification Models
Cross-Validation
Findings
Feature Ranking Analysis
