Abstract

We develop a flexible methodology to protect marketing data in the context of a business ecosystem in which data providers seek to meet the information needs of data users but wish to deter invalid use of the data by potential intruders. In this context, we propose a Bayesian probability model that produces protected synthetic data. A key feature of our proposed method is that the data provider can balance the trade-off between information loss resulting from data protection and risk of disclosure to intruders. We apply our methodology to the problem facing a vendor of retail point-of-sale data whose customers use the data to estimate price elasticities and promotion effects, while the vendor wishes to protect the identities of sample stores from possible intrusion. We define metrics to measure the average and maximum loss of protection implied by a data protection method. We show that, by enabling the data provider to choose the degree of protection to infuse into the synthetic data, our method performs well relative to seven benchmark data protection methods, including the extant approach of aggregating data across stores. Data are available at https://doi.org/10.1287/mksc.2017.1064.

