Abstract

Apple introduced privacy labels in Dec. 2020 as a way for developers to report the privacy behaviors of their apps. While Apple does not validate labels, they also require developers to provide a privacy policy, which offers an important comparison point. In this paper, we fine-tuned BERT-based language models to extract privacy policy features for 474,669 apps on the iOS App Store, comparing the output to the privacy labels. We identify discrepancies between the policies and the labels, particularly as they relate to data collected linked to users. We find that 228K apps' privacy policies may indicate data collection linked to users than what is reported in the privacy labels. More alarming, a large number (97%) of the apps with a Data Not Collected privacy label have a privacy policy indicating otherwise. We provide insights into potential sources for discrepancies, including the use of templates and confusion around Apple's definitions and requirements. These results suggest that significant work is still needed to help developers more accurately label their apps. Our system can be incorporated as a first-order check to inform developers when privacy labels are possibly misapplied.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.