Re-Identification Risk versus Data Utility for Aggregated Mobility Research Using Mobile Phone Location Data.

Ling Yin, Zhixiang Fang, Wei Wang, Shih-Lung Shaw, Ye Tao, Jinxing Hu, Qian Wang

doi:10.1371/journal.pone.0140589

Abstract

Mobile phone location data is a newly emerging data source with great potential to support human mobility research. However, recent studies have indicated that many users can be easily re-identified based on their unique activity patterns. Privacy protection procedures usually alter the original data and cause a loss of data utility for analysis purposes. Meeting the need for detailed data for activity analysis while avoiding potential privacy risks therefore presents a challenge. The aim of this study is to reveal the re-identification risks for a Chinese city’s mobile users and to examine the quantitative relationship between re-identification risk and data utility for an aggregated mobility analysis. The first step is to apply two reported attack models, the top N locations and the spatio-temporal points, to evaluate the re-identification risks in Shenzhen City, a metropolis in China. A spatial generalization approach to protecting privacy is then proposed and implemented, and a spatially aggregated analysis is used to assess the loss of data utility after privacy protection. The results demonstrate that the re-identification risks in Shenzhen City are clearly different from those reported for regions in Western countries, which demonstrates the spatial heterogeneity of re-identification risks in mobile phone location data. A uniform mathematical relationship has also been found between re-identification risk (x) and data utility (y) for both attack models: $y = -ax^{b} + c$ ($a, b, c > 0$; $0 < x < 1$), where the exponent b increases with the background knowledge of the attackers. The discovered mathematical relationship provides data publishers with useful guidance on choosing the right tradeoff between privacy and utility. Overall, this study contributes to a better understanding of re-identification risks and provides a privacy-utility tradeoff benchmark for improving privacy protection when sharing detailed trajectory data.
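To make the shape of the reported relationship concrete, the following is a minimal sketch that evaluates y = -ax^b + c for a few exponents. The function name `utility` and the coefficient values are assumptions chosen for illustration only; the fitted values from the study are not given in this text.

```python
def utility(risk, a, b, c):
    """Evaluate y = -a * x**b + c for a re-identification risk x in (0, 1)."""
    if not 0.0 < risk < 1.0:
        raise ValueError("risk must lie strictly between 0 and 1")
    if min(a, b, c) <= 0:
        raise ValueError("a, b and c must all be positive")
    return -a * risk ** b + c


# Hypothetical coefficients, picked only to show how the exponent b bends the curve.
A, C = 0.5, 1.0
for b in (0.5, 1.0, 2.0):
    row = [round(utility(x, A, b, C), 3) for x in (0.1, 0.5, 0.9)]
    print(f"b={b}: {row}")
```

With positive a and b, the function decreases monotonically in x over the interval, and a larger b keeps y close to c for small x before it drops off more steeply near x = 1.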

Highlights

As a routine procedure, mobile phone operators collect users’ location data with certain sampling methods for billing, troubleshooting, or other technical measurement purposes.
The MobiCom 2011 study proposed an attack model based on the top N locations and demonstrated that, at the spatial granularity of the mobile sector or mobile cell level, the top two or three locations from mobile users’ trajectories yielded unique identifications of 10%–50% of individuals in the United States [15].
The Scientific Reports 2013 study proposed an attack model based on spatio-temporal points and revealed that four randomly selected spatio-temporal points could uniquely identify 95% of the mobile users in a European country [16].
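The top N locations attack model mentioned in the highlights can be sketched as follows. This is a toy illustration, not the cited study's implementation: the function names, the choice to treat the top N locations as an ordered tuple, and the sample trajectories are all assumptions.

```python
from collections import Counter


def top_n_locations(visits, n):
    """Return the n most-visited location IDs of one user as an ordered tuple."""
    counts = Counter(visits)
    # Sort by descending visit count; break ties by location ID for determinism.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return tuple(loc for loc, _ in ranked[:n])


def unique_fraction(trajectories, n):
    """Fraction of users whose ordered top-n locations match no other user."""
    signatures = {user: top_n_locations(v, n) for user, v in trajectories.items()}
    group_sizes = Counter(signatures.values())
    unique = sum(1 for sig in signatures.values() if group_sizes[sig] == 1)
    return unique / len(trajectories)


# Made-up visit logs (one location ID per observation), for illustration only.
trajectories = {
    "u1": ["A", "A", "A", "B", "B", "C"],
    "u2": ["A", "A", "B", "B", "B", "C"],
    "u3": ["A", "A", "A", "B", "B", "D"],
    "u4": ["E", "E", "F"],
}

for n in (1, 2, 3):
    print(n, unique_fraction(trajectories, n))
```

In this toy data, u1 and u3 share the same top one and top two locations and only become distinguishable at N = 3, which mirrors the idea that more background knowledge raises the re-identification risk.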

Summary

Introduction

Mobile phone operators collect users’ location data with certain sampling methods for billing, troubleshooting, or other technical measurement purposes. With mobile phone location data, a number of studies over the past few years have made progress in formulating universal human mobility patterns [2,3,4], predicting human mobility [5,6], estimating origin-destination (OD) flows [7,8,9], modeling human movement [10], revealing population dynamics and hot spots [11,12], identifying important activity places [13], and mining daily activity structures [14]. The Scientific Reports 2013 study proposed an attack model based on spatio-temporal points and revealed that four randomly selected spatio-temporal points could uniquely identify 95% of the mobile users in a European country [16].
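The spatio-temporal points attack described above can be illustrated with a small sketch: an attacker is assumed to know p points drawn at random from a user's trace, and the user counts as re-identified when no other trace contains all of those points. The representation of a point as a (location, hour) pair, the function names, and the sample traces are assumptions made for this example, not details taken from the cited study.

```python
import random


def is_unique(traces, user, known_points):
    """True if `user` is the only one whose trace contains every known point."""
    matches = [u for u, trace in traces.items() if known_points <= trace]
    return matches == [user]


def uniqueness(traces, p, rng):
    """Fraction of users (with at least p points) re-identified from p random points."""
    hits = 0
    eligible = 0
    for user, trace in traces.items():
        if len(trace) < p:
            continue  # not enough points to sample from this user
        eligible += 1
        # Sort first so that sampling is reproducible for a seeded generator.
        known = set(rng.sample(sorted(trace), p))
        hits += is_unique(traces, user, known)
    return hits / eligible if eligible else 0.0


# Made-up traces: each point is a (location ID, hour of day) pair.
traces = {
    "u1": {("A", 8), ("B", 9), ("C", 18)},
    "u2": {("A", 8), ("B", 9), ("D", 18)},
    "u3": {("E", 8), ("F", 12)},
}

rng = random.Random(0)
for p in (1, 2, 3):
    print(p, uniqueness(traces, p, rng))
```

Because the known points are sampled at random, results for small p vary with the seed; in practice the estimate would be averaged over many draws per user.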

Journal: PLOS ONE	Publication Date: Oct 15, 2015
Citations: 24	License type: CC BY 4.0

Similar Papers

The Uncertain Geographic Context Problem in Identifying Activity Centers Using Mobile Phone Positioning Data and Point of Interest Data
Xingang Zhou ... Yang Yue
01 Jan 2015

Privacy Protection Method for Cellular Signaling Data Based on Genetic Algorithm
Hua Chen ... Ming Cai
Journal of Transportation Engineering, Part A: Systems | VOL. 149
01 Apr 2023

Questioning the Limits of Genomic Privacy
Bartha M Knoppers ... J.J Nietfeld
The American Journal of Human Genetics | VOL. 91
01 Sep 2012

Simulatable Binding: Beyond Simulatable Auditing
Lei Zhang ... Alexander Brodsky
24 Aug 2008