KronoDroid: Time-based Hybrid-featured Dataset for Effective Android Malware Detection and Characterization

Alejandro Guerra-Manzanares, Hayretdin Bahsi, Sven Nõmm

doi:10.1016/j.cose.2021.102399

Open Access

https://doi.org/10.1016/j.cose.2021.102399

Journal: Computers & Security	Publication Date: Jul 9, 2021
Citations: 54	License type: cc-by-nc-nd

Affiliation: Tallinn University of Technology

Abstract

Available datasets have neglected the evolution of Android malware, providing only a static snapshot of a non-stationary phenomenon. Android malware research has paid little attention to the time variable, overlooking its degrading effect on the performance of machine learning-based classifiers (i.e., concept drift). The sources of dynamic data and their particularities (i.e., real devices and emulators) have also been overlooked. These are critical factors to take into account when aiming to build more effective, robust, and long-lasting Android malware detection systems. In this research, different sources of benign and malware data are merged to generate a data set that spans a larger time frame, and 489 static and dynamic features are collected. The particularities of the source of the dynamic features (i.e., system calls) are addressed by using both an emulator and a real device, thus generating two equally featured sub-datasets. The main outcome of this research is a novel, labeled, hybrid-featured Android dataset that provides a timestamp for each data sample, covers all years of Android history from 2008 to 2020, and accounts for the distinct dynamic data sources. The emulator data set is composed of 28,745 malicious apps from 209 malware families and 35,246 benign samples. The real device data set contains 41,382 malware samples, belonging to 240 malware families, and 36,755 benign apps. Made publicly available as KronoDroid in a structured format, it is the largest hybrid-featured Android dataset and the only one that provides timestamped data, considers the particularities of dynamic data sources, and includes samples from over 209 Android malware families.
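To illustrate why per-sample timestamps matter for studying concept drift, the following is a minimal sketch of a chronological train/test split, in which a classifier would be trained only on older samples and evaluated on newer ones. The records, the `(timestamp, label)` layout, and the `temporal_split` helper are hypothetical and chosen for illustration; they are not KronoDroid's actual schema or the authors' evaluation procedure.

```python
from datetime import date

# Hypothetical records as (timestamp, label) pairs. This layout is an
# assumption for illustration, not the dataset's real column structure.
samples = [
    (date(2010, 5, 1), "benign"),
    (date(2012, 3, 9), "malware"),
    (date(2015, 7, 20), "benign"),
    (date(2017, 1, 2), "malware"),
    (date(2019, 11, 30), "malware"),
    (date(2020, 6, 15), "benign"),
]


def temporal_split(records, cutoff):
    """Split records so that training data strictly precedes test data.

    Samples dated before `cutoff` go to the training set; samples dated
    on or after `cutoff` go to the test set.
    """
    ordered = sorted(records, key=lambda r: r[0])
    train = [r for r in ordered if r[0] < cutoff]
    test = [r for r in ordered if r[0] >= cutoff]
    return train, test


train, test = temporal_split(samples, date(2016, 1, 1))
print(len(train), len(test))
```

Unlike a random split, this arrangement keeps future samples out of the training set, so any drop in test performance can reflect changes in the data over time.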

Similar Papers

Concept drift and cross-device behavior: Challenges and implications for effective android malware detection
Alejandro Guerra-Manzanares ... Hayretdin Bahsi
Computers & Security | VOL. 120
19 May 2022

Cross-device behavioral consistency: Benchmarking and implications for effective android malware detection
Alejandro Guerra-Manzanares ... Martin Välbe
Machine Learning with Applications | VOL. 9
18 Jun 2022

Android malware concept drift using system calls: Detection, characterization and challenges
Alejandro Guerra-Manzanares ... Hayretdin Bahsi
Expert Systems with Applications | VOL. 206
21 Apr 2022

Detecting Android malware using sequences of system calls
Gerardo Canfora ... Francesco Mercaldo
31 Aug 2015
