Abstract

Correctly classifying and filtering common libraries in Android applications can effectively improve the accuracy of repackaged application detection. However, existing common library detection methods barely meet the requirements of large-scale app markets because their classification rules make detection slow. To address this problem, a structural-similarity-based common library detection method for Android is presented. Sub-packages that are only weakly associated with the main package are extracted from the decompiled APK (Android application package) as common library candidates using the PDG (program dependency graph) method. With package structures and API calls as features, these candidates are then classified through coarse-grained and fine-grained filtering. Experimental results on a dataset of real-world applications show that the proposed method achieves a higher detection speed while preserving accuracy and false positive rate. These results indicate that the method is efficient and precise.

Highlights

  • The correct classifying and filtering of common libraries in Android applications can effectively improve the accuracy of repackaged application detection

  • Existing common library detection methods barely meet the requirements of large-scale app markets due to the low detection speed caused by their classification rules

Summary

Introduction

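For a variety of purposes (improving development efficiency, serving advertisements, and so on), a large number of Android applications use common code libraries during development. Most repackaging detection methods [4-5] try to exclude common code before detection in order to reduce its influence on the detection results. However, as new applications keep emerging and the total number of common libraries keeps growing, the efficiency of existing common library detection methods can no longer meet detection needs.

Some methods [4-5] use a whitelist to filter out the common code files in an application, which improves both detection efficiency and detection quality. This approach has two shortcomings: (1) the whitelist is not updated promptly and requires substantial manual effort; (2) package names serve as the filter condition, so a malicious application can easily evade filtering by renaming its packages. [...] obtains features such as permissions from the Xml configuration file and uses the extracted features to detect whether an application has been repackaged. However, the features extracted by that method are relatively coarse-grained, which tends to produce a high misjudgment rate. SimiDroid extracts features based on methods, application components, and resource files; with these features it not only analyzes the similarity between applications but also analyzes the detection results, achieving good detection performance [7]. Reference [8] extracts the key APIs called by an application and processes the graph-structured data with graph neural networks and clustering algorithms, offering a new approach to the cluster analysis of Android third-party libraries.

The core evaluation metrics for common library detection are: (1) detection rate, that is, accurately finding the common libraries in an application; (2) speed, that is, quickly examining applications at the scale of millions. The detection method proposed in this paper targets large-scale application environments. Taking the characteristics of Android installation packages into account, it analyzes the directory and file distribution structure of code packages, filters out code packages whose structures are dissimilar, and then performs a fine-grained comparison using the API features of the code files to complete the classification of common library packages.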
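The two-stage idea described above (a cheap structural filter followed by a fine-grained API comparison) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the structural feature is taken to be a histogram of code files per directory depth, both similarity measures are Jaccard-style overlaps, and the function names and thresholds (`coarse_threshold`, `fine_threshold`) are hypothetical.

```python
from collections import Counter
from pathlib import PurePosixPath


def structure_signature(files):
    """Coarse feature (assumed): number of code files at each directory depth."""
    return Counter(len(PurePosixPath(f).parts) - 1 for f in files)


def structure_similarity(sig_a, sig_b):
    """Multiset Jaccard overlap of two depth histograms, in the range 0..1."""
    depths = set(sig_a) | set(sig_b)
    shared = sum(min(sig_a[d], sig_b[d]) for d in depths)
    total = sum(max(sig_a[d], sig_b[d]) for d in depths)
    return shared / total if total else 1.0


def api_similarity(apis_a, apis_b):
    """Fine feature (assumed): Jaccard similarity of the sets of API calls."""
    union = apis_a | apis_b
    return len(apis_a & apis_b) / len(union) if union else 1.0


def is_same_library(candidate, known, coarse_threshold=0.8, fine_threshold=0.7):
    """Run the structural filter first; compare APIs only if it passes."""
    coarse = structure_similarity(
        structure_signature(candidate["files"]),
        structure_signature(known["files"]),
    )
    if coarse < coarse_threshold:
        return False  # structurally dissimilar: skip the costlier API comparison
    return api_similarity(candidate["apis"], known["apis"]) >= fine_threshold


# Hypothetical packages: `renamed` has the same layout and APIs as `known`
# but different package names; `other` has a different layout.
known = {"files": ["a/A.smali", "a/B.smali", "a/b/C.smali"], "apis": {"x", "y", "z"}}
renamed = {"files": ["p/A.smali", "p/B.smali", "p/q/C.smali"], "apis": {"x", "y", "z"}}
other = {"files": ["m/A.smali"], "apis": {"x", "y", "z"}}

print(is_same_library(renamed, known))
print(is_same_library(other, known))
```

Because neither feature in this sketch depends on package names, the renamed package still matches, which is the weakness of name-based whitelists noted above. Ordering the checks this way means the API comparison runs only for candidates that survive the structural filter.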
