Web Privacy: A Formal Adversarial Model for Query Obfuscation

Florimond Houssiau, Thibaut Liénart, Yves-Alexandre De Montjoye, Julien Hendrickx

doi:10.1109/tifs.2023.3262123

Abstract

The queries we perform, the searches we make, and the websites we visit: this sensitive data is collected at scale by companies as part of the services they provide. Query obfuscation, which intertwines the genuine queries of the user with artificial queries, has been proposed as a solution to protect the privacy of individuals on the web. Here we present a formal model and, through attack models, formulate three privacy requirements for obfuscators: (1) indistinguishability, that the user query should be hard to identify; (2) coverage, that its topic should be hard to identify; and (3) imprecision, that the query should still be hard to identify for an attacker with additional auxiliary information. The latter is needed to make the former two guarantees “future-proof”. Using our framework, we derive two important results for obfuscators. First, we show that indistinguishability imposes strong bounds on the coverage and imprecision achievable by an obfuscator. Second, we prove an important tradeoff between coverage and imprecision, which inherently limits the strength and robustness of the privacy guarantees that an obfuscator can provide. We then introduce a family of obfuscators with provable indistinguishability guarantees, which we call k-ball obfuscators, and show the coverage and imprecision achievable for a range of parameter values.
We show empirically that our theoretical tradeoff holds, and that its bound is not tight in practice: even in a simple idealized setting, there is a significant gap between practical coverage and imprecision guarantees and the optimal bounds. While obfuscators have proven popular with the general public, all obfuscators currently available provide only ad hoc guarantees and have been shown to be vulnerable to attacks, putting the data of users at risk. We hope this work will be a first step towards a robust evaluation of the properties of query obfuscators and the development of principled obfuscators.

Full Text