Abstract

Emerging smart transportation applications call for publishing and sharing individual-based mobility trace data sets with researchers and practitioners; meanwhile, privacy has become a major concern, given that the true identities of individuals can easily be revealed from these data sets. In this paper, we quantitatively measure the risk of privacy disclosure in mobility trace data sets caused by re-identification attacks, based on the concept of k-anonymity. Using a one-month license plate recognition (LPR) data set collected in Guangzhou, China, we examine a variety of factors determining the degree of anonymity of an individual, including the temporal granularity and the size of the published data, local vs. non-local vehicles, and continuous vs. non-continuous observations. We find that five spatiotemporal records are enough to uniquely identify about 90% of individuals, even when the temporal granularity is set to half a day. To publish LPR data without compromising privacy, we propose a suppression solution and a generalization solution and quantify the privacy-utility trade-off of each. Our results show that the suppression solution, which removes sensitive records, performs notably well on privacy protection: the average individual anonymity identified by three spatiotemporal records increases by more than 20% at the cost of losing less than 8% of the data. We also propose a bintree-based adaptive time interval cloaking algorithm as a generalization solution. To meet a specific anonymity constraint, this algorithm adjusts the temporal resolution adaptively based on traffic counts under the principle of minimal information loss. We find that the generalization algorithm performs extremely well in satisfying different user-specified anonymity constraints and that it is more flexible and reliable than the traditional uniform time interval cloaking method. We also find a strong correlation between the resulting temporal accuracy of data anonymized by the algorithm and the traffic condition. This study serves as a reminder to relevant agencies and data owners about the privacy vulnerability of individual-based mobility trace data sets and provides methodological guidance for publishing and sharing such sensitive data sets.
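
As a rough illustration of the anonymity measure described above, the following minimal Python sketch counts how many individuals in a data set are consistent with a handful of known spatiotemporal records; that count is the k in k-anonymity. It is not the authors' implementation. The record layout `(vehicle_id, station_id, timestamp_s)`, the function names, and the toy data are all assumptions made for illustration.

```python
from collections import defaultdict


def build_traces(records, granularity_s):
    """Group records by individual, coarsening timestamps into bins.

    records: iterable of (vehicle_id, station_id, timestamp_s) tuples
             (a hypothetical layout, not the paper's schema).
    granularity_s: width of a time bin in seconds.
    Returns a dict mapping vehicle_id -> set of (station_id, time_bin).
    """
    traces = defaultdict(set)
    for vehicle_id, station_id, timestamp_s in records:
        traces[vehicle_id].add((station_id, timestamp_s // granularity_s))
    return traces


def anonymity(traces, known_points):
    """Count the individuals whose trace contains every known point.

    A result of 1 means the known points single out one individual.
    """
    known = set(known_points)
    return sum(1 for points in traces.values() if known <= points)


# Toy data: three vehicles, two observations each (made up for illustration).
records = [
    ("A", "s1", 100), ("A", "s2", 4000),
    ("B", "s1", 200), ("B", "s3", 4100),
    ("C", "s1", 300), ("C", "s2", 4200),
]

hourly = build_traces(records, granularity_s=3600)
k_one_point = anonymity(hourly, [("s1", 0)])
k_two_points = anonymity(hourly, [("s1", 0), ("s2", 1)])

per_minute = build_traces(records, granularity_s=60)
k_fine = anonymity(per_minute, per_minute["A"])
```

In this toy example, one known record at hourly granularity matches all three vehicles, two known records narrow the match to two vehicles, and at one-minute granularity vehicle A's own records match only A. This mirrors the abstract's point that anonymity falls as an attacker knows more records or as the temporal granularity becomes finer.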
