How Does Code Optimization Impact Third-party Library Detection for Android Applications?
Android applications (apps) widely use third-party libraries (TPLs) to reuse functionality and simplify development. Unfortunately, these TPLs often contain vulnerabilities that attackers can exploit, with potentially catastrophic consequences for app users. To mitigate this threat, researchers have developed tools that detect the TPL versions used in an app and issue a warning when the app is found to use a vulnerable TPL version. Although these tools claim to resist the effects of code obfuscation, our preliminary study indicates that code optimization is also common during the app release process, and a lack of consideration for its impact significantly reduces the effectiveness of existing tools. To fill this gap, this work systematically investigates how and to what extent different optimization strategies affect existing tools. Our findings have led to a new tool named LibHunter, designed to withstand major code optimization strategies (e.g., Inlining and CallSite Optimization) while also resisting code obfuscation and shrinking. Extensive evaluations on a dataset of apps with optimization, obfuscation, and shrinking enabled show that LibHunter significantly outperforms existing tools. It achieves F1 values that surpass those of the best existing tools by 29.3% and 36.1% at the library and version levels, respectively. We also applied LibHunter to detect vulnerable TPLs in the top Google Play apps, which shows the scalability of our approach as well as its potential to facilitate malware detection.
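To make the claim about optimization concrete, the following is a minimal sketch of why a name-independent fingerprint can survive renaming yet fail after inlining. It is not LibHunter's algorithm or that of any evaluated tool; the instruction lists, `method_fingerprint`, and `inline_call` are hypothetical and exist only for illustration.

```python
import hashlib

def method_fingerprint(instructions):
    """Hash only the opcode mnemonics, so renamed call targets do not matter."""
    opcodes = [opcode for opcode, _operand in instructions]
    return hashlib.sha256(" ".join(opcodes).encode("utf-8")).hexdigest()[:16]

def inline_call(caller, callee_name, callee_body):
    """Toy inliner: splice the callee body (minus its return) into the caller."""
    body = [ins for ins in callee_body if not ins[0].startswith("return")]
    inlined = []
    for opcode, operand in caller:
        if opcode == "invoke-static" and operand == callee_name:
            inlined.extend(body)
        else:
            inlined.append((opcode, operand))
    return inlined

# Hypothetical library methods as published by the library author.
helper = [("iget", None), ("iput", None), ("return-void", None)]
caller = [("const", None), ("invoke-static", "helper"), ("return-void", None)]
library_fps = {method_fingerprint(helper), method_fingerprint(caller)}

# Renaming obfuscation: only the call target's name changes.
renamed_caller = [("const", None), ("invoke-static", "a"), ("return-void", None)]

# Optimization: helper() is inlined into its only caller and then removed.
optimized_caller = inline_call(caller, "helper", helper)
app_fps = {method_fingerprint(optimized_caller)}

matched = library_fps & app_fps
print("fingerprint survives renaming:",
      method_fingerprint(caller) == method_fingerprint(renamed_caller))
print(len(matched), "of", len(library_fps), "library methods matched after inlining")
```

In this toy setting the renamed method still matches, while the inlined app code matches neither library method, which is the kind of mismatch the study attributes to optimization.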
- Research Article
1
- 10.1088/1755-1315/428/1/012009
- Jan 1, 2020
- IOP Conference Series: Earth and Environmental Science
With the rapid development of the mobile market, the rich functionality provided by third-party libraries makes the development of multi-functional applications more efficient, so these libraries are widely integrated into Android applications. Security studies that involve Android third-party libraries, such as vulnerability mining, permission analysis, isolation mechanisms, clone application detection, and security testing, place different requirements on the accuracy and focus of third-party library detection, which has made Android third-party libraries a research hotspot. Clustering is a particularly important part of detection, so this article focuses on clustering algorithms for Android third-party libraries. Starting from the API call graph of an Android third-party library, this paper uses the graph attention network (GAT) to design a similarity calculation and library clustering model. First, a reverse-engineering tool is used to extract the API call graph of each Android third-party library, and a third-party library instance graph is then built based on package dependencies. Key API functions are selected to normalize the instance graph, and GAT and CNN are then used as the model for calculating the similarity between instance graphs. Finally, the DBSCAN clustering algorithm is used to cluster the instance graphs. Experimental results show that the proposed method achieves 93% clustering accuracy and clusters Android third-party libraries effectively.
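The final clustering step can be sketched as follows. This is a plain-Python DBSCAN over a precomputed distance matrix, assuming distance is defined as one minus the pairwise similarity; the similarity values, `eps`, and `min_samples` are made up for illustration, and the GAT/CNN similarity model itself is not reproduced.

```python
def dbscan(distance, eps, min_samples):
    """Cluster points from a precomputed distance matrix; -1 marks noise."""
    n = len(distance)
    labels = [None] * n  # None means "not visited yet"

    def neighbors(i):
        # A point counts as its own neighbor because distance[i][i] == 0.
        return [j for j in range(n) if distance[i][j] <= eps]

    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_samples:
            labels[i] = -1  # provisional noise, may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reachable from a core point: border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_samples:
                queue.extend(reachable)  # j is a core point, keep expanding
    return labels

# Hypothetical similarities between six library instance graphs.
groups = [0, 0, 0, 1, 1, 2]

def similarity(i, j):
    if i == j:
        return 1.0
    return 0.9 if groups[i] == groups[j] else 0.1

distance = [[1.0 - similarity(i, j) for j in range(6)] for i in range(6)]
labels = dbscan(distance, eps=0.2, min_samples=2)
print(labels)
```

With these made-up values, the first three instances form one cluster, the next two form a second, and the last instance is labeled as noise because no other instance is similar enough.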
- Research Article
24
- 10.1109/tse.2018.2872958
- Sep 1, 2020
- IEEE Transactions on Software Engineering
With the thriving of mobile app markets, third-party libraries are pervasively used in Android applications. The libraries provide functionalities such as advertising, location, and social networking services, making app development much more productive. However, the spread of vulnerable and harmful third-party libraries can also hurt the mobile ecosystem, leading to various security problems. Therefore, third-party library identification has emerged as an important problem, being the basis of many security applications such as repackaging detection, vulnerability identification, and malware analysis. Previously, we proposed a novel approach to identifying third-party Android libraries at a massive scale. Our method uses the internal code dependencies of an app to recognize library candidates and further classify them. With a fine-grained feature hashing strategy, we can handle code whose package and method names are obfuscated better than prior work can. We have developed a prototype tool called LibD and evaluated it with an up-to-date dataset containing 1,427,395 Android apps. Our experimental results show that LibD outperforms existing tools in detecting multi-package third-party libraries in the presence of name-based obfuscation, leading to significantly improved precision without loss of scalability. In this paper, we extend our earlier work by investigating the possibility of employing effective and scalable library detection to boost the performance of large-scale app analyses in the real world. We show that the technique of LibD can be used to accelerate whole-app Android vulnerability detection and quickly identify variants of vulnerable third-party libraries. This extension paper sheds light on the practical value of our previous research.
- Conference Article
188
- 10.1109/icse.2017.38
- May 1, 2017
With the thriving of mobile app markets, third-party libraries are pervasively integrated into Android applications. Third-party libraries provide functionality such as advertisements, location services, and social networking services, making multi-functional app development much more productive. However, the spread of vulnerable or harmful third-party libraries may also hurt the entire mobile ecosystem, leading to various security problems. The Android platform suffers severely from such problems due to the way its ecosystem is constructed and maintained. Therefore, third-party Android library identification has emerged as an important problem that is the basis of many security applications such as repackaging detection and malware analysis. According to our investigation, existing work on Android library detection still requires improvement in many aspects, including accuracy and obfuscation resilience. In response to these limitations, we propose a novel approach to identifying third-party Android libraries. Our method utilizes the internal code dependencies of an Android app to detect and classify library candidates. Unlike most previous methods, which classify detected library candidates based on similarity comparison, our method is based on feature hashing and can better handle code whose package and method names are obfuscated. Based on this approach, we have developed a prototype tool called LibD and evaluated it with an up-to-date, large-scale dataset. Our experimental results on 1,427,395 apps show that, compared to existing tools, LibD can better handle multi-package third-party libraries in the presence of name-based obfuscation, leading to significantly improved precision without loss of scalability.
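The two ideas in this abstract, grouping packages by internal code dependencies and classifying candidates by feature hashing, can be sketched roughly as below. This is not LibD's actual algorithm: the dependency pairs are assumed to already be limited to relations inside a library, and the package names, opcode lists, `group_packages`, and `candidate_hash` are hypothetical.

```python
import hashlib
from collections import defaultdict

def group_packages(packages, dependencies):
    """Union packages linked by a dependency into multi-package candidates."""
    parent = {p: p for p in packages}

    def find(p):
        while parent[p] != p:
            parent[p] = parent[parent[p]]  # path halving
            p = parent[p]
        return p

    for a, b in dependencies:
        parent[find(a)] = find(b)
    groups = defaultdict(set)
    for p in packages:
        groups[find(p)].add(p)
    return list(groups.values())

def candidate_hash(group, package_methods):
    """Hash sorted per-method opcode hashes; no package or method name is used."""
    method_hashes = sorted(
        hashlib.sha256(" ".join(opcodes).encode("utf-8")).hexdigest()
        for package in group
        for opcodes in package_methods[package]
    )
    return hashlib.sha256("".join(method_hashes).encode("utf-8")).hexdigest()[:16]

# Hypothetical app 1: a two-package library plus an unrelated package.
app1_methods = {
    "com.lib.core": [["const", "invoke-virtual", "return"]],
    "com.lib.net": [["new-instance", "invoke-direct", "return-object"]],
    "org.other": [["return-void"]],
}
app1_groups = group_packages(app1_methods, [("com.lib.core", "com.lib.net")])

# Hypothetical app 2: the same library after name-based obfuscation.
app2_methods = {
    "a.a": [["const", "invoke-virtual", "return"]],
    "a.b": [["new-instance", "invoke-direct", "return-object"]],
}
app2_groups = group_packages(app2_methods, [("a.b", "a.a")])

hashes1 = {candidate_hash(g, app1_methods) for g in app1_groups}
hashes2 = {candidate_hash(g, app2_methods) for g in app2_groups}
print("shared candidates:", len(hashes1 & hashes2))
```

Because the hash is built only from opcode sequences, the renamed two-package candidate in the second app collides with the original one in the first, while the unrelated package does not.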
- Conference Article
1
- 10.1109/cac57257.2022.10054907
- Nov 25, 2022
Third-party libraries are widely used in app development and provide technical solutions for feature-rich apps, but some third-party libraries collect a large amount of private user information and cause data leakage. According to our survey, about 60% of Android applications use Android packing services. Existing tools cannot effectively detect third-party libraries used in packed applications, and their analysis of third-party library privacy leaks is not comprehensive. To overcome these limitations, we improve traditional third-party library detection tools so that they can detect third-party libraries used in packed applications. We propose a fine-grained framework for detecting privacy-collecting third-party libraries and their privacy leakage in Android applications by combining Androguard, Frida, and the improved third-party library detection tools. Our experimental results on 300 mainstream apps show that our framework provides good support for analyzing packed applications, and that our approach can detect more third-party libraries and provide a more comprehensive analysis of their privacy leaks than existing tools.
- Conference Article
81
- 10.1109/icse43902.2021.00150
- May 1, 2021
Third-party libraries (TPLs), as essential parts of the mobile ecosystem, have become one of the most significant contributors to the huge success of Android, facilitating the fast development of Android applications. Detecting TPLs in Android apps is also important for downstream tasks, such as identifying malware and repackaged apps. To identify in-app TPLs, we need to solve several challenges, such as TPL dependencies, code obfuscation, and precise version representation. Unfortunately, existing TPL detection tools have been shown not to solve these challenges well, let alone identify the exact TPL versions. To this end, we propose a system, named ATVHunter, which can pinpoint the precise vulnerable in-app TPL versions and provide detailed information about the vulnerabilities and TPLs. We propose a two-phase detection approach to identify specific TPL versions. Specifically, we extract the Control Flow Graphs (CFGs) as the coarse-grained feature to match potential TPLs in the predefined TPL database, and then extract the opcodes in each basic block of the CFG as the fine-grained feature to identify the exact TPL versions. We build a comprehensive TPL database (189,545 unique TPLs with 3,006,676 versions) as the reference database. Meanwhile, to identify the vulnerable in-app TPL versions, we also construct a comprehensive database of known vulnerable TPLs containing 1,180 CVEs and 224 security bugs. Experimental results show that ATVHunter outperforms state-of-the-art TPL detection tools, achieving 90.55% precision and 88.79% recall with high efficiency, and is also resilient to widely used obfuscation techniques and scalable for large-scale TPL detection. 
Furthermore, to investigate the ecosystem of the vulnerable TPLs used by apps, we use ATVHunter to conduct a large-scale analysis of 104,446 apps and find that 9,050 apps include vulnerable TPL versions with 53,337 vulnerabilities and 7,480 security bugs, most of which are high-risk and are not recognized by app developers.
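The two-phase idea described in this abstract can be sketched as follows. This is a simplified illustration and not ATVHunter's implementation: the CFG signature, the Jaccard similarity, the `coarse_threshold` value, and the toy database are all assumptions chosen to keep the example short.

```python
def cfg_signature(method):
    """Coarse feature: the shape of the control flow graph, ignoring opcodes."""
    cfg = method["cfg"]  # block id -> list of successor block ids
    edge_count = sum(len(successors) for successors in cfg.values())
    out_degrees = tuple(sorted(len(successors) for successors in cfg.values()))
    return (len(cfg), edge_count, out_degrees)

def block_features(methods):
    """Fine feature: the opcode sequence of every basic block."""
    return {tuple(opcodes) for m in methods for opcodes in m["blocks"].values()}

def jaccard(a, b):
    return len(a & b) / len(a | b) if (a | b) else 1.0

def detect(app_methods, database, coarse_threshold=0.6):
    """Phase 1 filters libraries by CFG shape; phase 2 picks the closest version."""
    app_cfgs = {cfg_signature(m) for m in app_methods}
    app_blocks = block_features(app_methods)
    results = []
    for library, versions in database.items():
        library_cfgs = {cfg_signature(m) for ms in versions.values() for m in ms}
        if jaccard(app_cfgs, library_cfgs) < coarse_threshold:
            continue
        best = max(versions,
                   key=lambda v: jaccard(app_blocks, block_features(versions[v])))
        results.append((library, best))
    return results

diamond = {0: [1, 2], 1: [3], 2: [3], 3: []}  # an if/else shaped CFG

def demo_method(middle_opcode):
    return {"cfg": diamond,
            "blocks": {0: ["const", "if-eqz"], 1: [middle_opcode],
                       2: ["sub-int"], 3: ["return"]}}

# Hypothetical reference database: two versions that differ in one basic block.
database = {
    "demo": {"1.0": [demo_method("add-int")], "1.1": [demo_method("mul-int")]},
    "other": {"2.0": [{"cfg": {0: []}, "blocks": {0: ["return-void"]}}]},
}
app_methods = [demo_method("mul-int")]
detected = detect(app_methods, database)
print(detected)
```

Both versions of the hypothetical `demo` library pass the coarse CFG filter, and only the per-block opcodes separate version 1.1 from 1.0, which mirrors the coarse-then-fine division of labor in the abstract.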
- Research Article
22
- 10.1016/j.cose.2018.07.024
- Oct 15, 2018
- Computers & Security
Securing android applications via edge assistant third-party library detection
- Book Chapter
- 10.1007/978-3-030-02744-5_5
- Jan 1, 2018
Nowadays, redundant permissions and probing permissions are common in Android applications and third-party libraries, and they may pose serious security threats to users. Existing permission analysis tools may produce incorrect detection results because they disregard the relationships between permissions and the values of function parameters and fields. To extract the permissions actually used in Android applications and third-party libraries, we propose a Dalvik register-based data flow analysis technique (DARFA) to obtain the values of function parameters and fields. By leveraging DARFA, we design and implement PermHunter, a static analysis tool, to detect redundant permissions and probing permissions in Android apps and third-party libraries. We have evaluated PermHunter by analyzing 45 third-party libraries and 653 applications. The results indicate that nearly half of these third-party libraries have redundant permissions and probing permissions, and the proportions in Android applications are even higher.
- Conference Article
18
- 10.1109/apsec.2016.017
- Jan 1, 2016
Android applications typically contain multiple third-party libraries, and recent studies have shown that the presence of third-party libraries may introduce privacy risks and security threats. Furthermore, researchers have reported the importance of considering third-party libraries in program analysis tasks, because their presence may dilute the features and affect the accuracy of the results. Existing literature typically employs a whitelist to exclude third-party libraries from the analysis in order to achieve accurate results. However, these whitelists are generally incomplete and weak against the renaming obfuscation technique that is commonly employed in Android applications. In this paper, we propose LibSift, a tool to automatically detect third-party libraries in Android applications. LibSift detects third-party libraries based on package dependencies that are resilient to most common obfuscations. The evaluation results not only indicate that LibSift can detect third-party libraries accurately and effectively, but also show that LibSift can detect even the less popular libraries that are not detected by two of the state-of-the-art approaches.
- Conference Article
4
- 10.1109/candarw.2018.00088
- Nov 1, 2018
A third-party library (TPL) is often used in developing Android applications; however, older TPLs may have vulnerabilities. Developers therefore need to keep the TPLs in their applications at the latest version. Nevertheless, many applications still use older TPLs. In this paper, we propose a new method that enables users to update TPLs in Android applications. An Android application and its TPLs can be converted to smali files, an assembly-like representation, and a smali file can be replaced with another smali file for the same class. Our method takes advantage of this property and exchanges a vulnerable TPL for a security-fixed one. Moreover, we apply it to real applications and evaluate its feasibility.
- Research Article
8
- 10.1109/tdsc.2021.3075817
- Sep 1, 2022
- IEEE Transactions on Dependable and Secure Computing
Android application (or app) developers increasingly integrate third-party libraries to enrich the functionality of their apps. However, the current permission model on Android cannot constrain the behavior of in-app third-party libraries, because it allows them to operate with the same permissions as their host app. This raises serious security and privacy concerns for users. In this article, we propose LibCapsule, a user-level solution to confine third-party libraries from potential permission abuses. Compared to previous systems, LibCapsule is able to provide complete confinement of third-party libraries in Android apps, including the static Java code, dynamically loaded code, and native code of third-party libraries. We have developed a prototype of LibCapsule and collected 204 popular third-party libraries as well as 2,021 apps to evaluate it. The evaluation results indicate that LibCapsule is capable of enforcing complete and fine-grained regulation on third-party libraries according to customized security policies with a low performance overhead. To engage the whole community, we will release the dataset of third-party libraries and apps in our evaluation.
- Conference Article
37
- 10.1145/3196494.3196538
- May 29, 2018
Recent research suggests that 88% of Android applications that use Java cryptographic APIs make at least one mistake, which results in an insecure implementation. It is unclear, however, if these mistakes originate from code written by application or third-party library developers. Understanding the responsible party for a misuse case is important for vulnerability disclosure. In this paper, we bridge this knowledge gap and introduce source attribution to the analysis of cryptographic API misuse. We developed BinSight, a static program analyzer that supports source attribution, and we analyzed 132K Android applications collected in years 2012, 2015, and 2016. Our results suggest that third-party libraries are the main source of cryptographic API misuse. In particular, 90% of the violating applications, which contain at least one call-site to Java cryptographic API, originate from libraries. When compared to 2012, we found the use of ECB mode for symmetric ciphers has significantly decreased in 2016, for both application and third-party library code. Unlike application code, however, third-party libraries have significantly increased their reliance on static encryption keys for symmetric ciphers and static IVs for CBC mode ciphers. Finally, we found that the insecure RC4 and DES ciphers were the second and the third most used ciphers in 2016.
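As a rough illustration of misuse checks combined with source attribution, the sketch below classifies the transformation string passed to Java's `Cipher.getInstance` and attributes a call site by package prefix. It is not BinSight's analysis: the rule set, the `BLOCK_CIPHERS` list, the prefix-based attribution, and all class names are assumptions, and the "no mode means ECB" rule assumes a provider that defaults to ECB when the mode is omitted.

```python
WEAK_CIPHERS = {"DES", "RC4", "ARCFOUR"}
BLOCK_CIPHERS = {"AES", "DES", "DESEDE", "BLOWFISH"}  # illustrative, not exhaustive

def misuse_flags(transformation):
    """Flag weak ciphers and ECB mode in an 'algorithm/mode/padding' string."""
    parts = transformation.upper().split("/")
    algorithm = parts[0]
    mode = parts[1] if len(parts) > 1 else None
    flags = []
    if algorithm in WEAK_CIPHERS:
        flags.append("weak-cipher")
    # Assumption: a block cipher requested without a mode falls back to ECB.
    if mode == "ECB" or (mode is None and algorithm in BLOCK_CIPHERS):
        flags.append("ecb-mode")
    return flags

def attribute(call_site_class, app_package):
    """Attribute a call site to the app if its class lives under the app package."""
    in_app = (call_site_class == app_package
              or call_site_class.startswith(app_package + "."))
    return "app" if in_app else "library"

# Hypothetical call sites: (declaring class, transformation string).
call_sites = [
    ("com.example.app.Vault", "AES/CBC/PKCS5Padding"),
    ("com.ads.sdk.Util", "AES"),
    ("com.ads.sdk.Legacy", "DES/ECB/PKCS5Padding"),
]
report = [(attribute(cls, "com.example.app"), misuse_flags(t))
          for cls, t in call_sites]
print(report)
```

Real attribution is harder than a prefix check, since obfuscation can rename library packages; the prefix rule is only a stand-in for whatever attribution an analyzer actually performs.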
- Research Article
- 10.21220/s2-0fsx-bc44
- Jan 1, 2014
- W&M Publish (College of William & Mary)
In recent years, studies of design and programming practices in mobile development have been gaining more attention from researchers. Several such empirical studies used Android applications (paid, free, and open source) to analyze factors such as size, quality, dependencies, reuse, and cloning. Most of the studies use executable files of the apps (APK files) instead of source code because of availability issues (most free apps available at the official Android market are not open source, but can still be downloaded and analyzed in APK format). However, using only APK files in empirical studies comes with some threats to the validity of the results. In this paper, we analyze some of these pertinent threats. In particular, we analyzed the impact of third-party libraries and code obfuscation practices on estimating the amount of reuse by class cloning in Android apps. When including and excluding third-party libraries from the analysis, we found statistically significant differences in the amount of class cloning in 24,379 free Android apps. We also found some evidence that obfuscation is responsible for increasing the number of false positives when detecting class clones. Finally, based on our findings, we provide a list of actionable guidelines for mining and analyzing large repositories of Android applications and minimizing these threats to validity. While in our initial work we studied different factors that impact reuse in Android apps, we also designed and implemented an approach to help enable reuse in Android mobile applications. Although mobile app stores may present a list of similar apps to the user, this list may not be complete or accurate. Detecting similar applications is a notoriously difficult problem, since it implies that similar high-level requirements and their low-level implementations can be detected and matched automatically for different applications. 
We designed an approach for automatically detecting Closely reLated applications in ANdroid (CLANdroid), which helps detect Android applications similar to a given Android mobile app. CLANdroid extends CLAN, a previously published approach that is included in this thesis for completeness. Our main contributions are an extension to a framework of relevance and a novel algorithm that computes a similarity index between Java and Android applications using the notion of semantic layers that correspond to packages and class hierarchies. To evaluate CLANdroid, we extracted a goldset for each of the 14,450 apps in our dataset, which consisted of apps that were deemed similar based on the app's page on Google Play. We compared five different ranking methods: API calls, identifiers, intents, permissions, and phone sensors. The results show that, when considering the whole dataset, the identifiers ranking method is the most effective.
- Conference Article
2
- 10.1109/iccsnt.2016.8070145
- Dec 1, 2016
Many Android studies that focus on cloned Android applications (or apps) and third-party libraries involve detecting similar components between apps or between an app and a third-party library. By studying app bytecode and ProGuard, the most popular obfuscator used in Android development, we identify a set of stable class features that do not change during obfuscation. Based on these stable features, we propose a program analysis approach to detect similar packages, classes, or methods between apps. We generate app fingerprints from the stable features and then compare the fingerprints, so our approach works well with obfuscated apps. We evaluate our approach on manually written apps and evaluate its functionality on two groups of apps. The results show that our approach can analyze a pair of apps within 2 min, which makes it feasible for detecting similar components between apps.
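One plausible kind of stable feature is a method descriptor in which framework types are kept and all other class names are masked, since renaming obfuscation leaves framework classes untouched. The sketch below fingerprints a class that way; it is not the paper's feature set, and `FRAMEWORK_PREFIXES`, the `LX;` placeholder, and the example descriptors are assumptions.

```python
import hashlib
import re

# Assumption: classes under these prefixes are not renamed by the obfuscator.
FRAMEWORK_PREFIXES = ("java/", "javax/", "android/")

def normalize_descriptor(descriptor):
    """Keep framework types, replace every other class type with 'LX;'."""
    def mask(match):
        class_name = match.group(1)
        if class_name.startswith(FRAMEWORK_PREFIXES):
            return match.group(0)
        return "LX;"
    return re.sub(r"L([^;]+);", mask, descriptor)

def class_fingerprint(method_descriptors):
    """Hash the sorted, normalized descriptors of a class's methods."""
    normalized = sorted(normalize_descriptor(d) for d in method_descriptors)
    return hashlib.sha256("|".join(normalized).encode("utf-8")).hexdigest()[:16]

# Hypothetical class before and after renaming obfuscation.
original = ["(Landroid/content/Context;Lcom/lib/Config;)V", "()Lcom/lib/Result;"]
obfuscated = ["()La/c;", "(Landroid/content/Context;La/b;)V"]

same = class_fingerprint(original) == class_fingerprint(obfuscated)
print("fingerprints match:", same)
```

Sorting the descriptors makes the fingerprint independent of method order, which matters because an obfuscator may also reorder members.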
- Research Article
- 10.5753/jisa.2024.3882
- Jul 24, 2024
- Journal of Internet Services and Applications
The emergence of COVID-19 in 2019 had a profound international impact. Technologically, governments and significant organizations responded by spearheading the development of mobile applications to aid citizens in navigating the challenges posed by the pandemic. While many of these applications proved successful in their intended purpose, the safeguarding of user privacy was not consistently prioritized, revealing a prevalent use of third-party libraries commonly referred to as trackers. In our comprehensive analysis encompassing 595 Android applications, we uncovered trackers in 402 of them, leading to the inadvertent exposure of sensitive user information and device data on external servers. Our investigation delved into the methodologies employed by these trackers to harvest and exfiltrate information. Furthermore, we examined the positions adopted by both trackers and governments. This study underscores the critical need for a reevaluation of the inclusion of trackers in applications of such sensitivity. Recognizing the potential lack of awareness within the scrutinized organizations regarding the risks associated with integrating third-party libraries, particularly trackers, we introduce SAPITO as part of our contributions. SAPITO is an open-source tool designed to identify potential leaks of sensitive data by third-party libraries in Android applications, providing a valuable resource for enhancing the security and privacy measures of mobile applications in the face of evolving technological challenges.
- Research Article
- 10.1007/s10207-025-01105-0
- Jul 31, 2025
- International Journal of Information Security
Third-party payment libraries (TPLs) are widely used in Android applications to facilitate in-app transactions, yet their security implications remain largely underexplored. In this paper, we present a novel approach for automated detection and security analysis of payment libraries in Android applications. Our tool, PayScan, employs byte-pattern analysis and heuristic scanning techniques to identify TPLs and then it assesses their security posture. Additionally, the tool integrates three independent security scanners. We analyzed a dataset of 10,553 Android applications, detecting 18 payment libraries and evaluating their security and privacy risks. Our findings indicate that 71.7% of applications use outdated payment libraries, with some SDK versions being over four years old. Additionally, we identified 397 private key leaks across 212 applications. The security scanners detected over 20,000 vulnerabilities, including critical issues such as SSL misconfigurations, WebView XSS, and weak cryptographic implementations. We compare our detection approach against LibScout and LibRadar, demonstrating its practical performance in detecting payment libraries, including in obfuscated applications. This study reveals important security risks in mobile payment ecosystems and emphasizes the value of continued monitoring of third-party payment libraries. The proposed tool offers a scalable solution for detection and analysis, providing practical utility for researchers, developers, and auditors focused on financial application security.