Abstract

Severe privacy leakage in the AOL search log incident has attracted considerable worldwide attention. However, all the web users’ daily search intents and behavior are collected in such data, which can be invaluable for researchers, data analysts and law enforcement personnel to conduct social behavior study [14] , criminal investigation [5] and epidemics detection [10] . Thus, an important and challenging research problem is how to sanitize search logs with strong privacy guarantee and sufficiently retained utility. Existing approaches in search log sanitization are capable of only protecting the privacy under a rigorous standard [24] or maintaining good output utility [25] . To the best of our knowledge, there is little work that has perfectly resolved such tradeoff in the context of search logs, meeting a high standard of both requirements. In this paper, we propose a sanitization framework to tackle the above issue in a distributed manner. More specifically, our framework enables different parties to collaboratively generate search logs with boosted utility while satisfying Differential Privacy . In this scenario, two privacy-preserving objectives arise: first, the collaborative sanitization should satisfy differential privacy; second, the collaborative parties cannot learn any private information from each other. We present an efficient protocol – C ollaborative s E arch L og S anitization ( CELS ) to meet both privacy requirements. Besides security/privacy and cost analysis, we demonstrate the utility and efficiency of our approach with real data sets.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call