Abstract

As the Android ecosystem thrives, code is widely reused in Android apps in the form of third-party libraries. Recent research shows that emerging third-party libraries may introduce privacy risks and other security threats. Nevertheless, current approaches to library identification fall short of the required accuracy and efficiency. In this paper, we present LibHawkeye, a new clustering-based technique to identify third-party libraries in millions of Android apps. Our approach uses four kinds of dependencies inside Android apps to build intra-app dependency graphs, but discards package homogeny, on which most previous work relies heavily. In addition, we propose three refinement steps to eliminate as many false positives from the initial result as possible. An experiment on 1,000 apps shows that, compared to existing tools, LibHawkeye precisely identifies at least 26.5% more libraries. We also evaluate it on 3,987,206 Android apps published on Google Play; the accuracy of libraries sampled from the clustering result is 93.25%. The results show that LibHawkeye significantly outperforms state-of-the-art tools without loss of scalability.
