APPCorp: a corpus for Android privacy policy document structure analysis

Shuang Liu, Renjie Guo, Fan Zhang, Tao Chen, Baiyang Zhao, Meishan Zhang

doi:10.1007/s11704-022-1627-2

Abstract

As mobile devices grow more popular and mobile apps are widely adopted, concern over privacy issues is increasing. The privacy policy is regarded as a proper medium for conveying legal terms, such as those of the GDPR, and for establishing a binding legal agreement between service providers and users. However, privacy policies are usually too long and vague for end users to read and understand. It is therefore important to analyze the document structure of privacy policies automatically to help users understand them. In this work, we create a manually labelled corpus of 167 privacy policies (more than 447K words and 5,276 annotated paragraphs). We report the annotation process and the details of the annotated corpus. We also benchmark the corpus with 4 document classification models, thoroughly analyze the results, and discuss the challenges and opportunities for the research community in using the corpus. We release the labelled corpus and the classification models for public access.

Journal: Frontiers of Computer Science
Publication Date: Sep 12, 2022
Citations: 3
