© 2015 Joel R. Reidenberg, Travis Breaux, Lorrie Faith Cranor, Brian French, Amanda Grannis, James T. Graves, Fei Liu, Aleecia McDonald, Thomas B. Norton, Rohan Ramanath, N. Cameron Russell, Norman Sadeh and Florian Schaub.
† For their comments on this study, the authors would like to acknowledge and thank Alessandro Acquisti, Noah A. Smith, and Shomir Wilson, and the participants at the 2014 TPRC 42nd Research Conference on Communication, Information and Internet Policy. Funding for this project was provided, in part, by the National Science Foundation under its Secure and Trustworthy Computing (SaTC) initiative grants 1330596, 1330214, and 1330141 for “TWC SBE: Option: Frontier: Collaborative: Towards Effective Web Privacy Notice and Choice: A Multi-Disciplinary Prospective” and by a Fordham Law School Faculty Research Grant.
†† Respectively, Stanley D. and Nikki Waxberg Chair and Professor of Law, Fordham University; Assistant Professor of Computer Science, Carnegie Mellon University; Professor of Computer Science and Engineering & Public Policy, Carnegie Mellon University; Senior Research Programmer, Carnegie Mellon University; Research Fellow, Fordham Center on Law and Information Policy; Ph.D. Candidate (Engineering and Public Policy), Carnegie Mellon University; Ph.D. Candidate (Computer Science), Carnegie Mellon University; Director of Privacy, Stanford Center for Internet & Society; Privacy Fellow, Fordham Center on Law and Information Policy; Masters Candidate (Computer Science), Carnegie Mellon University; Executive Director, Fordham Center on Law and Information Policy; Professor of Computer Science, Carnegie Mellon University; Postdoctoral Fellow (Computer Science), Carnegie Mellon University.
40 BERKELEY TECHNOLOGY LAW JOURNAL [Vol. 30:1
Privacy policies are verbose, difficult to understand, take too long to read, and may be the least-read items on most websites even as users express growing concerns about information collection practices. For all their faults, though, privacy policies remain the single most important source of information for users to attempt to learn how companies collect, use, and share data. Likewise, these policies form the basis for the self-regulatory notice and choice framework that is designed and promoted as a replacement for regulation. The underlying value and legitimacy of notice and choice depend, however, on the ability of users to understand privacy policies. This paper investigates the differences in interpretation among expert, knowledgeable, and typical users and explores whether these groups can understand the practices described in privacy policies at a level sufficient to support rational decision-making. This paper seeks to fill an important gap in the understanding of privacy policies through primary research on user interpretation and to inform the development of technologies combining natural language processing, machine learning, and crowdsourcing for policy interpretation and summarization.
For this research, we recruited a group of law and public policy graduate students at Fordham University, Carnegie Mellon University, and the University of Pittsburgh (“knowledgeable users”) and presented these law and policy researchers with a set of privacy policies from companies in the e-commerce and news and entertainment industries. We asked them nine basic questions about the policies’ statements regarding data collection, data use, and retention. We then presented the same set of policies to a group of privacy experts and to a group of crowd workers representing typical Internet users.
The findings show areas of common understanding across all groups for certain data collection and deletion practices, but also demonstrate very important discrepancies in the interpretation of privacy policy language, particularly with respect to data sharing. The discordant interpretations arose both within groups and between the experts and the two other groups. The presence of these significant discrepancies has critical implications. First, the common understandings of some attributes of described data practices mean that semiautomated extraction of meaning from website privacy policies may be able to assist typical users and improve the effectiveness of notice by conveying the true meaning of these policies. However, the disagreements among experts and between experts and the other groups reflect that ambiguous wording in typical privacy policies undermines the ability of privacy policies to effectively convey notice of data practices to the general public. The results of this research will, consequently, have significant policy implications for the construction of the notice and choice framework and for the U.S. reliance on this approach. The gap in interpretation indicates that privacy policies may be misleading the general public and that those policies could be considered legally unfair and deceptive. And, where websites are not conveying privacy policies to consumers in a way that a “reasonable person” could, in fact, understand, “notice and choice” fails as a framework. Such a failure has broad international implications since websites extend their reach beyond the United States.