Abstract

In the recent years, studies of design and programming practices in mobile development are gaining more attention from researchers. Several such empirical studies used Android applications (paid, free, and open source) to analyze factors such as size, quality, dependencies, reuse, and cloning. Most of the studies use executable files of the apps (APK files), instead of source code because of availability issues (most of free apps available at the Android official market are not open-source, but still can be downloaded and analyzed in APK format). However, using only APK files in empirical studies comes with some threats to the validity of the results. In this paper, we analyze some of these pertinent threats. In particular, we analyzed the impact of third-party libraries and code obfuscation practices on estimating the amount of reuse by class cloning in Android apps. When including and excluding third-party libraries from the analysis, we found statistically significant differences in the amount of class cloning 24,379 free Android apps. Also, we found some evidence that obfuscation is responsible for increasing a number of false positives when detecting class clones. Finally, based on our findings, we provide a list of actionable guidelines for mining and analyzing large repositories of Android applications and minimizing these threats to validity. While in our initial work we studied different factors that impact reuse in Android apps, we also designed and implemented an approach to help facilitate the enabling of reuse in Android mobile applications. Although mobile app stores may have a list of similar apps to present to the user, this list may not be complete and/or accurate. Detecting similar applications is a notoriously difficult problem, since it implies that similar highlevel requirements and their low-level implementations can be detected and matched automatically for different applications. We designed an approach for automatically detecting Closely reLated applications in ANdroid (CLANdroid), which helps detect similar Android applications based on a given Android mobile app. CLANdroid is an extension to a novel approach by CLAN, which is a previously published approach that is included in this thesis for completeness purposes. Our main contributions are an extension to a framework of relevance and a novel algorithm that computes a similarity index between Java and Android applications using the notion of semantic layers that correspond to packages and class hierarchies. To evaluate CLANdroid we extracted a goldset for each of the 14,450 apps in our dataset, which consisted of apps that were deemed similar based on the app's page on Google Play. We compared five different ranking methods: API calls, identifiers, intents, permissions, and phone sensors. The results show that when considering the whole dataset, the identifiers ranking method is most effective.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call