Abstract

Data brokers such as Acxiom and Experian are in the business of collecting and selling data on people; the data they sell is commonly used to feed marketing as well as political campaigns. Despite the ongoing privacy debate, there is still very limited visibility into data collection by data brokers. Recently, however, online advertising services such as Facebook have begun to partner with data brokers to add targeting features to their platforms, providing avenues to gain insight into data broker information. In this paper, we leverage the Facebook advertising system, and its partnership with six data brokers across seven countries, to gain insight into the extent and accuracy of data collection by data brokers today. We find that a surprisingly large percentage of Facebook accounts (e.g., above 90% in the U.S.) are successfully linked to data broker information. Moreover, by running controlled ads to 183 crowdsourced U.S.-based volunteers, we find that at least 40% of user attributes sourced from data brokers are not at all accurate, that users can have widely varying fractions of inaccurate attributes, and that even important information such as financial information can have a high degree of inaccuracy. Overall, this paper provides the first fine-grained look into the extent and accuracy of data collection by offline data brokers, helping to inform the ongoing privacy debate.

Highlights

  • Data brokers such as Acxiom [5] and Experian [23] have traditionally collected, aggregated, and linked information about people’s activities, based on a variety of sources

  • Data brokers and online services have begun partnering together, allowing for the data collected about users online to be linked against data collected offline

  • This enables online services to provide advertisers with targeting features that concern users’ offline information


Summary

INTRODUCTION

Data brokers such as Acxiom [5] and Experian [23] have traditionally collected, aggregated, and linked information about people’s activities, based on a variety of sources (e.g., voter records, vehicle registries, loyalty cards, and so forth). Online services such as Facebook and Google have been collecting information about people’s online activities. Their business model is to build advertising platforms that use this data to provide advertisers with fine-grained targeting features [29, 31]. Data brokers and online services have begun partnering together, allowing for the data collected about users online to be linked against data collected offline. This enables online services to provide advertisers with targeting features that concern users’ offline information (e.g., advertisers can target users based on their net worth, purchase behavior, and so forth [4, 34]). Our results present the first detailed look into the coverage and accuracy of the data broker ecosystem; our methodology could be used to help trace the provenance of collected data, as well as to study how data collection practices change over time.

Offline data brokers
Facebook’s advertising platform
Related work
Methodology for studying coverage
Limitations
Analysis of coverage
Methodology for studying accuracy
Broker
User recruitment
Analysis of accuracy
Spending methods
Findings
DISCUSSION
CONCLUSION
