Programming Differential Privacy
- Research Article
11
- 10.1016/j.automatica.2022.110722
- Nov 29, 2022
- Automatica
Differential initial-value privacy and observability of linear dynamical systems
- Conference Article
6
- 10.1109/bigcom.2019.00029
- Aug 1, 2019
Mobile Crowdsensing (MCS) has become an effective technology for urban data sensing and acquisition, but it also brings the risk of trajectory privacy disclosure for participants. Most existing efforts add noise to the reported location information to protect the trajectory privacy of participating users. However, in many scenarios participants are required to report real location information (e.g., high-quality map generation, traffic flow monitoring). To address this problem, we propose a differential-privacy-based trajectory privacy protection scheme with real-location reporting in MCS. First, we define trajectory privacy protection based on real-path reporting under differential privacy. Second, we give a differential trajectory privacy protection framework that protects participants' trajectory privacy under Bayesian inference attacks. Finally, we prove that the differential trajectory privacy problem is NP-hard, and we design an approximate algorithm for reporting participants' road segments with a trajectory privacy guarantee. Experimental results on both simulated and real data sets show that the proposed trajectory privacy protection scheme performs well.
- Research Article
15
- 10.1016/j.procs.2014.09.013
- Jan 1, 2014
- Procedia Computer Science
Applying Moving Average Filtering for Non-interactive Differential Privacy Settings
- Conference Article
5
- 10.1109/isit44484.2020.9174484
- Jun 1, 2020
Differential privacy (DP) is an influential privacy measure and has been studied to protect private data. DP has mostly been studied in classical probability theory, and few researchers have studied quantum versions of DP. In this paper, we consider classical-quantum DP mechanisms which (i) convert binary private data to quantum states and (ii) satisfy a quantum version of the DP constraint. The class of classical-quantum DP mechanisms contains the classical DP mechanisms. As a main result, we show that some classical DP mechanism optimizes any information quantity satisfying the information processing inequality. Therefore, the performance of classical DP mechanisms attains that of classical-quantum DP mechanisms.
- Research Article
- 10.52710/cfs.363
- Feb 13, 2025
- Computer Fraud and Security
Privacy Preserving Federated Learning Efficiency Optimization Algorithm based on Differential Privacy
- Research Article
- 10.1016/j.ijar.2024.109242
- Jul 2, 2024
- International Journal of Approximate Reasoning
General inferential limits under differential and Pufferfish privacy
- Research Article
- 10.3233/sji-200685
- Jan 1, 2020
- Statistical Journal of the IAOS
Differential privacy (DP) has emerged in the computer science literature as a measure of the impact on an individual’s privacy resulting from the publication of a statistical output such as a frequency table. This paper provides an introduction to DP for official statisticians and discusses its relevance, benefits and challenges from a National Statistical Organisation (NSO) perspective. We motivate our study by examining how privacy is evolving in the era of big data and how this might prompt a shift from traditional statistical disclosure techniques used in official statistics – which are generally applied on a cell-by-cell or table-by-table basis – to formal privacy methods, like DP, which are applied from a perspective encompassing the totality of the outputs generated from a given dataset. We identify an important interplay between DP’s holistic privacy risk measure and the difficulty for NSOs in implementing DP, showing that DP’s major advantage is also DP’s major challenge. This paper provides new work addressing two key DP research areas for NSOs: DP’s application to survey data and its incorporation within the Five Safes framework.
- Research Article
92
- 10.1613/jair.1.14649
- Jul 23, 2023
- Journal of Artificial Intelligence Research
Machine Learning (ML) models are ubiquitous in real-world applications and are a constant focus of research. Modern ML models have become more complex, deeper, and harder to reason about. At the same time, the community has started to realize the importance of protecting the privacy of the training data that goes into these models. Differential Privacy (DP) has become a gold standard for making formal statements about data anonymization. However, while some adoption of DP has happened in industry, attempts to apply DP to real world complex ML models are still few and far between. The adoption of DP is hindered by limited practical guidance of what DP protection entails, what privacy guarantees to aim for, and the difficulty of achieving good privacy-utility-computation trade-offs for ML models. Tricks for tuning and maximizing performance are scattered among papers or stored in the heads of practitioners, particularly with respect to the challenging task of hyperparameter tuning. Furthermore, the literature seems to present conflicting evidence on how and whether to apply architectural adjustments and which components are “safe” to use with DP. In this survey paper, we attempt to create a self-contained guide that gives an in-depth overview of the field of DP ML. We aim to assemble information about achieving the best possible DP ML model with rigorous privacy guarantees. Our target audience is both researchers and practitioners. Researchers interested in DP for ML will benefit from a clear overview of current advances and areas for improvement. We also include theory-focused sections that highlight important topics such as privacy accounting and convergence. For a practitioner, this survey provides a background in DP theory and a clear step-by-step guide for choosing an appropriate privacy definition and approach, implementing DP training, potentially updating the model architecture, and tuning hyperparameters. 
For both researchers and practitioners, consistently and fully reporting privacy guarantees is critical, so we propose a set of specific best practices for stating guarantees. With sufficient computation and a sufficiently large training set or supplemental nonprivate data, both good accuracy (that is, almost as good as a non-private model) and good privacy can often be achieved. And even when computation and dataset size are limited, there are advantages to training with even a weak (but still finite) formal DP guarantee. Hence, we hope this work will facilitate more widespread deployments of DP ML models.
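The privacy accounting this survey abstract alludes to can be illustrated with the two standard composition bounds for repeated runs of an ε-DP mechanism (a hedged sketch using the textbook formulas; the function names are ours, not the survey's):

```python
import math

def basic_composition(epsilons):
    """Sequential composition: running mechanisms with privacy costs
    eps_1, ..., eps_k on the same data yields (sum of eps_i)-DP."""
    return sum(epsilons)

def advanced_composition(epsilon, k, delta_prime):
    """Advanced composition bound for k runs of an epsilon-DP mechanism:
    the composition is (eps', delta')-DP with
    eps' = eps * sqrt(2 k ln(1/delta')) + k * eps * (e^eps - 1),
    which grows roughly as sqrt(k) instead of k for small epsilon."""
    return (epsilon * math.sqrt(2.0 * k * math.log(1.0 / delta_prime))
            + k * epsilon * (math.exp(epsilon) - 1.0))
```

For 100 runs at ε = 0.1 with δ′ = 1e-5, basic composition gives a total of 10, while the advanced bound gives roughly 5.9, showing why tighter accounting matters for DP training loops that touch the data many times.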
- Research Article
8
- 10.1145/3651153
- Apr 26, 2024
- ACM Computing Surveys
Differential privacy has become the de facto standard for defining and preserving privacy. It has had great success in scenarios of local data privacy and statistical dataset privacy. As a primitive definition, standard differential privacy has been adapted to a wide range of practical scenarios. In this work, we summarize differential privacy adaptations in specific scenarios and analyze the correlations between data characteristics and differential privacy design. We present them along two lines: differential privacy adaptations for local data privacy and differential privacy adaptations for statistical dataset privacy. With a focus on differential privacy design, this survey aims to provide guiding rules for designing differential privacy in these scenarios, to identify opportunities for adaptively applying differential privacy in emerging technologies, and to further improve differential privacy itself with the assistance of cryptographic primitives.
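The local-data-privacy line this survey abstract mentions is typically built on randomized response, the canonical local-DP primitive. A minimal sketch (our own illustrative names, not the survey's code) in which each user perturbs a single bit before reporting it, and the aggregator debiases the noisy reports:

```python
import math
import random

def randomized_response(bit, epsilon, rng=random):
    """Report a 0/1 bit under epsilon-local-DP via randomized response.

    The true bit is reported with probability e^eps / (e^eps + 1) and
    flipped otherwise, so the likelihood ratio between any two inputs
    producing the same output is at most e^eps.
    """
    p_truth = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    return bit if rng.random() < p_truth else 1 - bit

def debias_mean(reports, epsilon):
    """Unbiased estimate of the true proportion of 1s from noisy reports.

    E[report] = (1 - p) + bit * (2p - 1), so inverting that affine map
    on the observed mean recovers the population proportion.
    """
    p = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    observed = sum(reports) / len(reports)
    return (observed - (1.0 - p)) / (2.0 * p - 1.0)
```

With enough reports the debiased mean concentrates around the true proportion, which is exactly the local-versus-statistical trade-off the survey contrasts: each individual report is protected, and accuracy is recovered only in aggregate.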
- Research Article
3
- 10.1016/j.ic.2017.03.002
- Mar 22, 2017
- Information and Computation
Differential privacy in probabilistic systems
- Research Article
90
- 10.29012/jpc.689
- Oct 20, 2019
- Journal of Privacy and Confidentiality
Differential privacy is at a turning point. Implementations have been successfully leveraged in private industry, the public sector, and academia in a wide variety of applications, allowing scientists, engineers, and researchers to learn about populations of interest without specifically learning about the individuals in them. Because differential privacy allows us to quantify cumulative privacy loss, these differentially private systems will, for the first time, allow us to measure and compare the total privacy loss due to these personal data-intensive activities. Appropriately leveraged, this could be a watershed moment for privacy. Like other technologies and techniques that allow for a range of instantiations, implementation details matter. When meaningfully implemented, differential privacy supports deep data-driven insights with minimal worst-case privacy loss. When not meaningfully implemented, differential privacy delivers privacy mostly in name. Using differential privacy to maximize learning while providing a meaningful degree of privacy requires judicious choices with respect to the privacy parameter epsilon, among other factors. However, there is little understanding of what the optimal value of epsilon is for a given system, class of systems, purpose, or dataset, or of how to go about determining it. To understand current differential privacy implementations and how organizations make these key choices in practice, we conducted interviews with practitioners to learn from their experiences of implementing differential privacy. We found no clear consensus on how to choose epsilon, nor is there agreement on how to approach this and other key implementation decisions. Given the importance of these implementation details, there is a need for shared learning amongst the differential privacy community.
To serve these purposes, we propose the creation of the Epsilon Registry—a publicly available communal body of knowledge about differential privacy implementations that can be used by various stakeholders to drive the identification and adoption of judicious differentially private implementations.
- Research Article
77
- 10.1109/tsp.2020.3006760
- Jan 1, 2020
- IEEE Transactions on Signal Processing
Differential privacy is a formal mathematical framework for quantifying the degree of individual privacy in a statistical database. To guarantee differential privacy, a typical method is to add random noise to the original data for data release. In this paper, we investigate the conditions of differential privacy (single-dimensional case) for the general random noise adding mechanism, and then apply the obtained results to the privacy analysis of the privacy-preserving consensus algorithm. Specifically, we obtain a necessary and sufficient condition for $\epsilon$-differential privacy, and sufficient conditions for $(\epsilon, \delta)$-differential privacy. We apply them to analyze various random noises. For the special cases with known results, our theory not only matches the literature but also provides an efficient approach to estimating the privacy parameters; for other cases that are unknown, our approach provides a simple and effective tool for differential privacy analysis. Applying the obtained theory to the privacy-preserving consensus algorithm, we obtain a necessary condition and a sufficient condition to ensure differential privacy.
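The noise-adding route to $\epsilon$-differential privacy that this abstract analyzes in general can be sketched with its best-known special case, the Laplace mechanism (a hedged illustration of the standard mechanism, not the paper's specific conditions; the function name is ours):

```python
import math
import random

def laplace_mechanism(true_value, sensitivity, epsilon, rng=random):
    """Release true_value with Laplace noise of scale sensitivity/epsilon.

    For a query whose global sensitivity (max change over neighboring
    databases) is `sensitivity`, this release satisfies epsilon-DP.
    """
    b = sensitivity / epsilon  # Laplace scale parameter
    # Sample Laplace(0, b) by inverse-CDF from a uniform on (-1/2, 1/2).
    u = rng.random() - 0.5
    noise = -b * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))
    return true_value + noise
```

Smaller epsilon means a larger noise scale b and stronger privacy; the necessary-and-sufficient conditions in the paper characterize which other noise distributions achieve the same guarantee.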
- Research Article
29
- 10.1162/99608f92.63a22079
- Jan 31, 2020
- Harvard Data Science Review
Accessing and combining large amounts of data is important for quantitative social scientists, but increasing amounts of data also increase privacy risks. To mitigate these risks, important players in official statistics, academia, and business see a solution in the concept of differential privacy. In this opinion piece, we ask how differential privacy can benefit from social-scientific insights, and, conversely, how differential privacy is likely to transform social science. First, we put differential privacy in the larger context of social science. We argue that the discussion on implementing differential privacy has been clouded by incompatible subjective beliefs about risk, each perspective having merit for different data types. Moreover, we point out existing social-scientific insights that suggest limitations to the premises of differential privacy as a data protection approach. Second, we examine the likely consequences for social science if differential privacy is widely implemented. Clearly, workflows must change, and common social science data collection will become more costly. However, in addition to data protection, differential privacy may bring other positive side effects. These could solve some issues social scientists currently struggle with, such as p-hacking, data peeking, or overfitting; after all, differential privacy is basically a robust method to analyze data. We conclude that, in the discussion around privacy risks and data protection, a large number of disciplines must band together to solve this urgent puzzle of our time, including social science, computer science, ethics, law, and statistics, as well as public and private policy.
- Research Article
1
- 10.1145/3729294
- Jun 10, 2025
- Proceedings of the ACM on Programming Languages
Differential privacy (DP) has become the gold standard for privacy-preserving data analysis, but implementing it correctly has proven challenging. Prior work has focused on verifying DP at a high level, assuming either that the foundations are correct or that a perfect source of random noise is available. However, the underlying theory of differential privacy can be very complex and subtle. Flaws in basic mechanisms and random number generation have been a critical source of vulnerabilities in real-world DP systems. In this paper, we present SampCert, the first comprehensive, mechanized foundation for executable implementations of differential privacy. SampCert is written in Lean with over 12,000 lines of proof. It offers a generic and extensible notion of DP, a framework for constructing and composing DP mechanisms, and formally verified implementations of Laplace and Gaussian sampling algorithms. SampCert provides (1) a mechanized foundation for developing the next generation of differentially private algorithms, and (2) mechanically verified primitives that can be deployed in production systems. Indeed, SampCert's verified algorithms power the DP offerings of Amazon Web Services, demonstrating its real-world impact. SampCert's key innovations include: (1) A generic DP foundation that can be instantiated for various DP definitions (e.g., pure, concentrated, Rényi DP); (2) formally verified discrete Laplace and Gaussian sampling algorithms that avoid the pitfalls of floating-point implementations; and (3) a simple probability monad and novel proof techniques that streamline the formalization. To enable proving complex correctness properties of DP and random number generation, SampCert makes heavy use of Lean's extensive Mathlib library, leveraging theorems in Fourier analysis, measure and probability theory, number theory, and topology.
- Research Article
8
- 10.1109/tfuzz.2022.3157385
- Feb 1, 2023
- IEEE Transactions on Fuzzy Systems
Transportation networks are essential to the operation of societies and economies. Protecting sensitive information is an important concern in sustainable transport when mining transportation data. In data mining, differential privacy (DP) provides provable privacy guarantees for releasing sensitive data by introducing randomness into query results. However, it suffers from significant accuracy loss when the query has high sensitivity (e.g., triangle counting). The reason is that the range of random perturbation applied to each query result in DP is too large: it spans all possible output values for the query, which form a large or even unbounded interval. However, when perturbation is imposed only in a small neighborhood of the true query result, the randomness-based similarity measure in DP fails. We therefore introduce fuzziness into DP to formulate new models with smaller disturbance via fuzzy similarity measures. In this article, we establish a novel and general theory of private data analysis, fuzzy differential privacy (FDP). FDP aims to acquire a more flexible tradeoff between the accuracy of outputs and the privacy-preserving level of the data. FDP combines DP with fuzzy set theory by introducing fuzziness into the query results and characterizing similarities between outputs via multiple fuzzy similarity measures. From this perspective, DP can be viewed as a special case of FDP with a probabilistic similarity measure. Compared with DP, FDP has three superiorities: 1) most fuzzy similarity measures in FDP support the sliding window perturbation strategies we propose, which perturb within a small neighborhood of the query result; 2) FDP adds noise to the query results according to only a fraction of all possible neighboring datasets; and 3) the fuzzy similarity, valued in [0,1], quantifies the privacy protection level intuitively.
These three points enable more accurate outputs while providing provable and intuitive privacy guarantees. For subgraph counting, the state-of-the-art method is the ladder framework in DP. We illustrate FDP mechanisms by applying them to a common application in subgraph counting: triangle/4-clique counting. Experiments show that FDP is effective and efficient, with smaller output errors than DP.