An efficient method for privacy-preserving trajectory data publishing based on data partitioning

Songyuan Li, Hui Tian, Hong Shen, Yingpeng Sang

doi:10.1007/s11227-019-02906-6

Abstract

Since Osman Abul et al. first proposed k-anonymity-based privacy protection for trajectory data, researchers have proposed a variety of trajectory privacy-preserving methods. These methods mainly adopt a static anonymity algorithm that focuses only on the trajectories in a specific time span and directly anonymizes and publishes the data, without considering the dynamic nature of trajectory data as new time slices arrive. Furthermore, because of its correlation with time and position, trajectory data is produced at large scale and with many sensitive attributes; traditional k-anonymity-based privacy-preserving models need to recalculate the previously released trajectory data, which increases the computing cost and reduces the availability of the released trajectories, so they are not suited to privacy protection for large-scale trajectory data. Therefore, this paper presents a method to dynamically publish large-scale vehicle trajectory data with privacy protection under $$(k,\delta )$$ security constraints. According to the spatial and temporal characteristics of vehicle trajectory data, this paper first proposes a method to partition the trajectory data for storage and computation. We choose the sample points $$(x_{i},y_{i})$$ at time $$t_{i}$$ as partition points and store the partitions of the trajectory data according to the time sequence and location of the running vehicle. This results in efficient trajectory scanning, clustering and privacy protection. We use $$(x_{i},y_{i},t_{m}-t_{n})$$ to represent the identifier of the trajectory data to publish, and use the generalize function to cluster trajectory data under the $$(k,\delta )$$ security constraints. In this way, we can effectively process the trajectories in every data partition as time goes on without recalculating the released trajectories, which effectively reduces the computing cost.
Through experiments on real trajectory data and Oldenburg trajectory data, we validate the data partitioning method for privacy-preserving large-scale trajectory data publishing under the $$(k,\delta )$$ security constraint and l-diversity. In the experimental comparison, our method maintains the lowest level of computing cost and higher data availability.
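
To make the partitioning idea concrete, the following is a minimal Python sketch, not the algorithm from the paper. It assumes each sample point is bucketed by a uniform spatial grid cell and a fixed-length time slice. The names `Sample`, `partition_key`, `add_samples`, `cell_size` and `slice_len` are hypothetical and chosen for illustration only. It shows how samples from a newly arrived time slice can be appended without touching partitions built from earlier slices.

```python
from typing import Dict, List, NamedTuple, Tuple


class Sample(NamedTuple):
    """One trajectory sample point (hypothetical record layout)."""
    vehicle_id: str
    x: float
    y: float
    t: float  # timestamp, e.g. seconds since the start of recording


Key = Tuple[int, int, int]


def partition_key(s: Sample, cell_size: float, slice_len: float) -> Key:
    """Map a sample to a (grid column, grid row, time slice) bucket.

    Assumption: a uniform grid and fixed-length time slices; the paper may
    define its partition points differently.
    """
    return (int(s.x // cell_size), int(s.y // cell_size), int(s.t // slice_len))


def add_samples(
    partitions: Dict[Key, List[Sample]],
    samples: List[Sample],
    cell_size: float = 100.0,
    slice_len: float = 60.0,
) -> Dict[Key, List[Sample]]:
    """Append samples to their partitions, creating buckets as needed.

    Buckets that receive no new sample are left untouched, so results
    computed on earlier time slices do not need to be rebuilt.
    """
    for s in samples:
        partitions.setdefault(partition_key(s, cell_size, slice_len), []).append(s)
    return partitions


# First batch: two vehicles in the same grid cell during the first time slice.
first_batch = [Sample("a", 10.0, 20.0, 5.0), Sample("b", 50.0, 80.0, 30.0)]
partitions = add_samples({}, first_batch)

# A new time slice arrives: only the bucket for that slice is created.
second_batch = [Sample("a", 150.0, 20.0, 65.0)]
add_samples(partitions, second_batch)
```

In this sketch, clustering and anonymization under the $$(k,\delta )$$ constraint would then run per bucket; that step is omitted here because the abstract does not specify it.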

Full Text