Abstract

Privacy laws and app stores (e.g., the Google Play Store) require mobile apps to have transparent privacy policies that disclose sensitive actions and data collection, such as accessing the phonebook, camera, storage, GPS, and microphone. However, many mobile apps do not accurately disclose sensitive data access that requires ‘dangerous’ permissions. Analyzing discrepancies between apps’ permissions and privacy policies thus helps identify compliance issues on which privacy regulators and marketplace operators can act. This paper proposes PermPress, an automated machine-learning system that evaluates an Android app’s permission-completeness, i.e., whether its privacy policy matches its dangerous permissions. PermPress combines machine-learning techniques with human annotation of privacy policies to establish whether app policies contain permission-relevant information. PermPress leverages MPP-270, an annotated policy corpus, to establish a gold-standard dataset of permission completeness. This corpus shows that only 31% of apps disclose all of their dangerous permissions in their privacy policies. Using the annotated dataset and machine-learning techniques, PermPress achieves an AUC score of 0.92 in predicting the permission-completeness of apps. A large-scale evaluation of 164,156 Android apps shows that, on average, 7% of apps fail to disclose more than half of their declared dangerous permissions in their privacy policies, while 60% of apps omit at least one dangerous permission-related data collection from their privacy policies. This investigation uncovers the non-transparent state of app privacy policies and highlights the need to standardize the compliance and completeness checking of app privacy policies.
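
To make the notion of permission-completeness concrete, the following minimal sketch compares an app’s declared dangerous permissions against the permission-relevant data practices disclosed in its policy and reports the fraction disclosed plus the omissions. All identifiers, the permission-to-practice mapping, and the example inputs are hypothetical illustrations; PermPress itself applies machine learning over human-annotated policies rather than this kind of direct matching.

```python
# Hypothetical sketch of a permission-completeness check, not PermPress's API.
# Maps a few Android dangerous permissions to the data practice a policy
# would need to disclose for that permission to count as covered.
DANGEROUS_PERMISSIONS = {
    "android.permission.READ_CONTACTS": "phonebook",
    "android.permission.CAMERA": "camera",
    "android.permission.ACCESS_FINE_LOCATION": "gps",
    "android.permission.RECORD_AUDIO": "microphone",
    "android.permission.READ_EXTERNAL_STORAGE": "storage",
}

def permission_completeness(declared_permissions, disclosed_practices):
    """Return (fraction of declared dangerous permissions whose data
    practice is disclosed in the policy, set of undisclosed permissions)."""
    relevant = {p: DANGEROUS_PERMISSIONS[p]
                for p in declared_permissions if p in DANGEROUS_PERMISSIONS}
    if not relevant:
        return 1.0, set()
    omissions = {p for p, practice in relevant.items()
                 if practice not in disclosed_practices}
    return 1 - len(omissions) / len(relevant), omissions

# Example: an app declares camera and GPS access, but its policy only
# discloses camera use -> completeness 0.5, GPS omitted.
score, missing = permission_completeness(
    {"android.permission.CAMERA", "android.permission.ACCESS_FINE_LOCATION"},
    {"camera"},
)
print(score, missing)  # 0.5 {'android.permission.ACCESS_FINE_LOCATION'}
```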
