Data minimization is a legal principle that mandates limiting the collection of personal data to the necessary minimum. In this context, we consider pervasive mobile-to-mobile recommender systems in which users in physical proximity establish ad hoc wireless connections between their mobile computing devices to exchange ratings, which are personal data, and calculate recommendations from them. The specific problem is: how can users minimize the collection of ratings over all users when each can communicate only with the subset of other users in physical proximity? A main difficulty is user mobility, which prevents, for instance, the creation and use of an overlay network to coordinate data collection. Users therefore have to decide, whenever an ad hoc wireless connection is established, whether to exchange ratings and how many. We model the randomness of these connections and apply an algorithm based on distributed gradient descent to solve this distributed data minimization problem. We show that, compared to an array of baselines, the algorithm robustly produces the fewest connections and the fewest collected ratings. We find that this simultaneously reduces the chances of an attacker relating users to ratings. In this sense, the algorithm also preserves the anonymity of users, but only of those who do not establish an ad hoc wireless connection with each other; users who do establish a connection are trivially not anonymous toward each other. We find that users can further minimize data collection and preserve their anonymity if they aggregate multiple ratings on the same item into a single rating and change their identifiers between connections.