Abstract

As the number of smart mobile devices in use grows, so does the volume of malware targeting mobile platforms. As the dominant player in the market, Android has been the favourite target of attackers. Current machine learning and deep learning approaches to Android malware detection rely on a variety of feature creation methods, most of which build frequency-based vectors from the files in the Android application package (APK). These frequency-based methods fail to preserve the semantic information present in those files. In this paper, we propose a method that combines static analysis with document embeddings, a natural language processing (NLP) technique, to generate feature vectors representing the information contained in the Android manifest and Dalvik executable files inside an APK. These embeddings are then used to train binary classifiers that can effectively distinguish benign from malicious Android applications. In our experiments, the proposed method outperformed related work on the test datasets.
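
The limitation of frequency-based features can be illustrated with a minimal sketch. This is not the authors' pipeline: the token names and the helper `frequency_vector` are hypothetical, chosen only to show that a count vector discards the order in which tokens appear.

```python
from collections import Counter


def frequency_vector(tokens, vocabulary):
    """Return a bag-of-words count vector over a fixed vocabulary."""
    counts = Counter(tokens)
    return [counts[term] for term in vocabulary]


# Hypothetical token sequences, standing in for tokens extracted
# from the files of two different applications.
app_a = ["read_contacts", "open_socket", "send_data"]
app_b = ["open_socket", "send_data", "read_contacts"]

# Shared vocabulary, sorted so both vectors use the same positions.
vocabulary = sorted(set(app_a) | set(app_b))

vec_a = frequency_vector(app_a, vocabulary)
vec_b = frequency_vector(app_b, vocabulary)

# The two sequences differ in order, yet their count vectors match,
# so a classifier trained on these vectors cannot tell them apart.
print(vec_a == vec_b)   # compares the frequency vectors
print(app_a == app_b)   # compares the original sequences
```

Document embeddings are intended to capture contextual information of this kind that count vectors discard, which is presumably the motivation for using them here; the abstract does not specify the embedding model or its settings.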

