Abstract

Although Android has become a leading operating system in the market, Android users suffer from security threats due to malware. To protect users from these threats, solutions that detect and identify malware variants are essential. However, modern malware evades existing solutions by applying code obfuscation and native code. To resolve this problem, we introduce an ensemble-based malware classification algorithm using malware family grouping. The proposed family grouping algorithm finds the optimal combination of families belonging to the same group while the total number of families is fixed at the optimal number. It also adopts a unified feature extraction technique that handles both bytecode and native code seamlessly. We propose a unique feature selection algorithm that improves classification performance and time simultaneously. 2-gram based features are generated from the instructions and segments, and the most effective features are then selected using multiple filters. Through extensive simulation with many obfuscated and native code malware applications, we confirm that the algorithm can classify malware with high accuracy and short processing time. Most existing approaches failed to achieve classification speed and detection time simultaneously. Therefore, the approach can effectively help Android users keep themselves safe from various and evolving cyber-attacks.
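
The 2-gram feature generation and filter-based selection mentioned above can be sketched as follows. This is a minimal illustration only, not the authors' implementation: the opcode sequences are invented, and the document-frequency filter in `select_by_document_frequency` is a hypothetical stand-in, since the filters actually used are not named in this summary.

```python
from collections import Counter


def two_grams(tokens):
    """Count adjacent token pairs (2-grams) in one instruction sequence."""
    return Counter(zip(tokens, tokens[1:]))


def select_by_document_frequency(samples, min_fraction):
    """Keep 2-grams that occur in at least min_fraction of the samples.

    Assumption: a single document-frequency filter stands in for the
    multiple filters the abstract refers to.
    """
    doc_freq = Counter()
    for tokens in samples:
        # Count each 2-gram at most once per sample.
        doc_freq.update(set(two_grams(tokens)))
    threshold = min_fraction * len(samples)
    return sorted(gram for gram, n in doc_freq.items() if n >= threshold)


# Hypothetical opcode sequences, invented for illustration.
samples = [
    ["move", "invoke", "move", "return"],
    ["const", "invoke", "move", "return"],
    ["const", "add", "return"],
]
selected = select_by_document_frequency(samples, min_fraction=0.6)
```

Because the same tokenised representation could be produced from either bytecode or native instructions, one extraction routine can serve both, which is the idea behind a unified feature extraction step.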

Highlights

  • With over 2 billion active Android devices worldwide [1,2], Android is considered the world's most popular mobile operating system

  • Since the classification results depend on feature selection, we propose a practical feature extraction and selection algorithm that achieves good performance on a large-scale dataset

  • The main challenge in increasing the malware classification performance of our proposed algorithm is determining how many and which filter-method algorithms should be used

Summary

Introduction

With over 2 billion active Android devices worldwide [1,2], Android is considered the world's most popular mobile operating system. We propose an ensemble learning technique that classifies malware using results from different groups of malware families, achieving high scalability in terms of accuracy and processing time. Highly accurate group-based classification: this study introduces an efficient way to group malware families by applying Boosted Random Forest (BRF) to a dataset and recursively combining the largest family with the smallest one. Such a grouping technique greatly helps to increase classification performance and reduce processing time. Robust malware classification: whereas many existing works fail to detect or classify modern malware that uses code obfuscation and native code, our algorithm provides classification results with high accuracy.
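
The grouping step described above, combining the largest family with the smallest one until a fixed number of groups remains, might look like the sketch below. It is one literal reading of that sentence under stated assumptions: the family names and sizes are invented, `group_families` is a hypothetical helper, and the BRF-based evaluation that the study uses to choose the grouping is omitted entirely.

```python
def group_families(family_sizes, num_groups):
    """Merge the largest group with the smallest until num_groups remain.

    family_sizes maps a family name to its sample count. Returns a list
    of (member_names, total_size) tuples. Assumption: size is the only
    merge criterion; no classifier feedback is used.
    """
    if num_groups < 1:
        raise ValueError("num_groups must be at least 1")
    groups = [([name], size) for name, size in family_sizes.items()]
    while len(groups) > num_groups:
        groups.sort(key=lambda g: g[1])  # ascending by total size
        smallest = groups.pop(0)
        largest = groups.pop()
        merged = (largest[0] + smallest[0], largest[1] + smallest[1])
        groups.append(merged)
    return groups


# Hypothetical family sizes, invented for illustration.
sizes = {"A": 100, "B": 60, "C": 10, "D": 5}
result = group_families(sizes, num_groups=2)
```

With these sizes, family A absorbs D and then C, leaving B as its own group. A classifier trained per group would then only need to separate the families inside that group, which is one plausible way the grouping could reduce processing time.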

Related Work
Malware Detection
DroidNative
DroidSieve
Malware Classification
Ensemble Clustering and Classifier
RevealDroid
IagoDroid
Proposed Algorithm
System Overview
Feature Engineering
Family Grouping
Ensemble Model Generation
Classification Accuracy
Classification Time
Conclusion