Channel Simulation: Theory and Applications to Lossy Compression and Differential Privacy

Abstract

One-shot channel simulation (or channel synthesis) has seen increasing applications in lossy compression, differential privacy and machine learning. In this setting, an encoder observes a source X, and transmits a description to a decoder, so as to allow it to produce an output Y with a desired conditional distribution $P_{Y|X}$. In other words, the encoder and the decoder are simulating the noisy channel $P_{Y|X}$ using noiseless communication. This can also be seen as a lossy compression scheme with a stronger guarantee on the joint distribution of X and Y. This monograph gives an overview of the theory and applications of the channel simulation problem. We will present a unifying review of various one-shot and asymptotic channel simulation techniques that have been proposed in different areas, namely dithered quantization, rejection sampling, minimal random coding, likelihood encoder, soft covering, Poisson functional representation, and dyadic decomposition.
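
As a concrete illustration of the setting described in the abstract, the following minimal sketch uses subtractive dithered quantization, one of the techniques the abstract lists, to simulate an additive uniform-noise channel Y = X + Z with Z uniform on [-1/2, 1/2]. The function names, the unit step size, and the use of a shared seeded generator as common randomness are assumptions made for this example and are not taken from the monograph.

```python
import random


def encode(x: float, u: float) -> int:
    """Encoder: add the shared dither and quantize to the nearest integer."""
    return round(x + u)


def decode(k: int, u: float) -> float:
    """Decoder: subtract the same shared dither from the received integer."""
    return k - u


def simulate_channel(x: float, rng: random.Random) -> float:
    """One use of the simulated channel.

    The dither u stands in for common randomness known to both sides
    (an assumption of this sketch). Only the integer k is communicated.
    """
    u = rng.uniform(-0.5, 0.5)
    k = encode(x, u)
    return decode(k, u)


rng = random.Random(0)
x = 3.7
# The error Y - X equals round(x + u) - (x + u), which is uniform on
# [-1/2, 1/2] regardless of the value of x.
errors = [simulate_channel(x, rng) - x for _ in range(10_000)]
print(min(errors), max(errors))
```

The point of the sketch is that the decoder's output follows the target conditional distribution exactly, while the message is a single integer; this is the "stronger guarantee on the joint distribution" in a very simple special case.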

Similar Papers
  • Research Article
  • Cite count: 10
  • 10.1109/tit.2018.2865386
On Privacy Amplification, Lossy Compression, and Their Duality to Channel Coding
  • Dec 1, 2018
  • IEEE Transactions on Information Theory
  • Joseph M Renes

We examine the task of privacy amplification from information-theoretic and coding-theoretic points of view. In the former, we give a one-shot characterization of the optimal rate of privacy amplification against classical adversaries in terms of the optimal type-II error in asymmetric hypothesis testing. This formulation can be easily computed to give finite-blocklength bounds and turns out to be equivalent to smooth min-entropy bounds by Renner and Wolf [Asiacrypt 2005] and Watanabe and Hayashi [ISIT 2013], as well as a bound in terms of the $E_\gamma$ divergence by Yang, Schaefer, and Poor [arXiv:1706.03866 [cs.IT]]. In the latter, we show that protocols for privacy amplification based on linear codes can be easily repurposed for channel simulation. Combined with known relations between channel simulation and lossy source coding, this implies that privacy amplification can be understood as a basic primitive for both channel simulation and lossy compression. Applied to symmetric channels or lossy compression settings, our construction leads to protocols of optimal rate in the asymptotic i.i.d. limit. Finally, appealing to the notion of channel duality recently detailed by us in [IEEE Trans. Info. Theory 64, 577 (2018)], we show that linear error-correcting codes for symmetric channels with quantum output can be transformed into linear lossy source coding schemes for classical variables arising from the dual channel. This explains a "curious duality" in these problems for the (self-dual) erasure channel observed by Martinian and Yedidia [Allerton 2003; arXiv:cs/0408008] and partly anticipates recent results on optimal lossy compression by polar and low-density generator matrix codes.

  • Conference Article
  • Cite count: 166
  • 10.1109/icdm.2012.80
Differentially Private Histogram Publishing through Lossy Compression
  • Dec 1, 2012
  • Gergely Acs + 2 more

Differential privacy has emerged as one of the most promising privacy models for private data release. It can be used to release different types of data, and, in particular, histograms, which provide useful summaries of a dataset. Several differentially private histogram releasing schemes have been proposed recently. However, most of them directly add noise to the histogram counts, resulting in undesirable accuracy. In this paper, we propose two sanitization techniques that exploit the inherent redundancy of real-life datasets in order to boost the accuracy of histograms. They lossily compress the data and sanitize the compressed data. Our first scheme is an optimization of the Fourier Perturbation Algorithm (FPA) presented in \cite{RN10}. It improves the accuracy of the initial FPA by a factor of 10. The other scheme relies on clustering and exploits the redundancy between bins. Our extensive experimental evaluation over various real-life and synthetic datasets demonstrates that our techniques preserve very accurate distributions and considerably improve the accuracy of range queries over attributed histograms.
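
For context, the baseline that this abstract contrasts against, adding noise directly to each histogram count, can be sketched as below. This is not the paper's Fourier-based or clustering-based scheme. The function names, the example counts, and the choice of sensitivity 1 (neighbouring datasets differ by adding or removing one record) are assumptions made for the illustration.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw Laplace(0, scale) noise.

    The difference of two independent exponential variables with mean
    `scale` follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_histogram(counts, epsilon, rng, sensitivity=1.0):
    """Add independent Laplace noise of scale sensitivity / epsilon to each bin."""
    scale = sensitivity / epsilon
    return [c + laplace_noise(scale, rng) for c in counts]


rng = random.Random(0)
counts = [120, 45, 0, 7]          # hypothetical histogram
released = noisy_histogram(counts, epsilon=0.5, rng=rng)
print(released)
```

Because the noise scale does not shrink with the bin count, sparse bins such as the 0 and 7 above are perturbed heavily in relative terms, which is the accuracy problem the abstract refers to.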

  • Research Article
  • 10.59275/j.melba.2025-113f
Data Exfiltration by Compression Attack: Definition and Evaluation on Medical Image Data
  • Dec 5, 2025
  • Machine Learning for Biomedical Imaging
  • Huiyu Li + 2 more

With the rapid expansion of data lakes storing health data and hosting AI algorithms, a prominent concern arises: how safe is it to export machine learning models from these data lakes? In particular, deep network models, widely used for health data processing, encode information from their training dataset, potentially leading to the leakage of sensitive information upon export. This paper thoroughly examines this issue in the context of medical imaging data and introduces a novel data exfiltration attack based on image compression techniques. This attack, termed Data Exfiltration by Compression, requires only access to a data lake and is based on lossless or lossy image compression methods. Unlike previous data exfiltration attacks, it is compatible with any image processing task and depends solely on an exported network model without requiring any additional information collected during the training process. We explore various scenarios and techniques to limit the size of the exported model and conceal the compression codes within the network. Using two public datasets of CT and MR images, we demonstrate that this attack can effectively steal medical images and reconstruct them outside the data lake with high fidelity, achieving an optimal balance between compression and reconstruction quality. Additionally, we investigate the impact of basic differential privacy measures, such as adding Gaussian noise to the model parameters, to prevent the data exfiltration by compression attack. We also show how the attacker can make its attack resilient to differential privacy at the expense of decreasing the number of stolen images. Lastly, we propose an alternative prevention strategy by fine-tuning the model to be exported.

  • Conference Article
  • Cite count: 1
  • 10.1109/itw48936.2021.9611421
Multiple-Output Channel Simulation and Lossy Compression of Probability Distributions
  • Oct 17, 2021
  • Chak Fung Choi + 1 more

We consider a variant of the channel simulation problem with a single input and multiple outputs, where Alice observes a probability distribution $P$ from a set of prescribed probability distributions $\mathcal{P}$, and sends a prefix-free codeword $W$ to Bob to allow him to generate $n$ i.i.d. random variables $X_{1}, X_{2}, \ldots, X_{n}$ which follow the distribution $P$. This can also be regarded as a lossy compression setting for probability distributions. This paper describes encoding schemes for three cases of $P$: $P$ is a distribution over positive integers, $P$ is a continuous distribution over $[0,1]$ with a non-increasing pdf, and $P$ is a continuous distribution over $[0, \infty)$ with a non-increasing pdf. We show that the growth rate of the expected codeword length is sub-linear in $n$ when a power law bound is satisfied. An application of multiple-output channel simulation is the compression of probability distributions. A full version of this paper is accessible at: https://arxiv.org/pdf/2105.01045.pdf

  • Research Article
  • Cite count: 57
  • 10.1109/tsp.2023.3244092
Joint Privacy Enhancement and Quantization in Federated Learning
  • Jan 1, 2023
  • IEEE Transactions on Signal Processing
  • Natalie Lang + 3 more

Federated learning (FL) is an emerging paradigm for training machine learning models using possibly private data available at edge devices. The distributed operation of FL gives rise to challenges that are not encountered in centralized machine learning, including the need to preserve the privacy of the local datasets, and the communication load due to the repeated exchange of updated models. These challenges are often tackled individually via techniques that induce some distortion on the updated models, e.g., local differential privacy (LDP) mechanisms and lossy compression. In this work we propose a method coined joint privacy enhancement and quantization (JoPEQ), which jointly implements lossy compression and privacy enhancement in FL settings. In particular, JoPEQ utilizes vector quantization based on random lattice, a universal compression technique whose byproduct distortion is statistically equivalent to additive noise. This distortion is leveraged to enhance privacy by augmenting the model updates with dedicated multivariate privacy preserving noise. We show that JoPEQ simultaneously quantizes data according to a required bit-rate while holding a desired privacy level, without notably affecting the utility of the learned model. This is shown via analytical LDP guarantees, distortion and convergence bounds derivation, and numerical studies. Finally, we empirically assert that JoPEQ demolishes common attacks known to exploit privacy leakage.

  • Research Article
  • Cite count: 23
  • 10.1109/access.2020.3041854
A Privacy-Preserving Game Model for Local Differential Privacy by Using Information-Theoretic Approach
  • Jan 1, 2020
  • IEEE Access
  • Ningbo Wu + 2 more

Local differential privacy (LDP) is an effective privacy-preserving model for settings that lack a trusted entity. The main idea of LDP is to add randomness to the real data in order to protect each individual's sensitive information. Randomized response is an effective method for realizing an LDP mechanism. In fact, randomized response is a probabilistic mapping from the real data to perturbed data, which can be modeled as an information-theoretic lossy compression mechanism. Moreover, the privacy budget ϵ has become a de facto standard for quantifying the worst-case privacy leakage. However, this metric cannot determine which mechanism is optimal within a set of equivalent ϵ-privacy mechanisms. In addition, privacy and utility are closely tied to the privacy mechanism, and existing methods do not consider the behavior of a strategic adversary. In this paper, we tackle the privacy-utility trade-off under a rational framework, using information-theoretic quantities as the metrics. To address the problem, we first formulate this trade-off as a minimax information leakage problem. Then, we propose a privacy-preserving attack and defense (PPAD) game framework, that is, a two-person zero-sum (TPZS) game. Further, we develop an alternating optimization algorithm to compute the saddle point of the proposed PPAD game. As a case study, we apply our method to compare several alternative ln 2-privacy mechanisms; the experimental results demonstrate that it provides an effective way to compare equivalent ϵ-privacy mechanisms. Furthermore, the numerical simulation results confirm that the proposed method is also useful for the protector in assessing privacy disclosure risks.
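
The description of randomized response as a probabilistic mapping from real to perturbed data can be made concrete with a small sketch of the standard binary mechanism. It is one example of a ln 2-privacy mechanism of the kind the abstract mentions, not the paper's PPAD game or its optimization algorithm; all function names here are hypothetical.

```python
import math
import random


def truth_probability(epsilon: float) -> float:
    """Probability of reporting the true bit: e^eps / (1 + e^eps)."""
    return math.exp(epsilon) / (1.0 + math.exp(epsilon))


def randomized_response(bit: int, epsilon: float, rng: random.Random) -> int:
    """Report the true bit with probability truth_probability(epsilon), else flip it."""
    return bit if rng.random() < truth_probability(epsilon) else 1 - bit


def privacy_loss(epsilon: float) -> float:
    """Worst-case log-likelihood ratio between the two possible inputs."""
    p = truth_probability(epsilon)
    return math.log(p / (1.0 - p))


epsilon = math.log(2)             # the ln 2 budget; truth is reported w.p. 2/3
rng = random.Random(0)
reports = [randomized_response(1, epsilon, rng) for _ in range(10)]
print(reports, privacy_loss(epsilon))
```

With ϵ = ln 2 the ratio between the probabilities of any output under the two inputs is at most 2, which is exactly the ϵ-LDP condition. Many different mappings meet the same bound, which is why the budget alone cannot rank them.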

  • Conference Article
  • Cite count: 13
  • 10.1109/isit50566.2022.9834551
Joint Privacy Enhancement and Quantization in Federated Learning
  • Jun 26, 2022
  • Natalie Lang + 1 more

Federated learning (FL) is an emerging paradigm for training machine learning models using possibly private data available at edge devices. Among the key challenges associated with FL are first the need to preserve the privacy of the local data sets, and second the communication load due to the repeated exchange of updated models; both are often tackled individually with methods whose operation distorts the updated models, e.g., local differential privacy (LDP) mechanisms and lossy compression, respectively. In this work we propose a method for joint privacy enhancement and quantization (JoPEQ), unifying lossy compression and privacy enhancement for FL. JoPEQ utilizes universal vector quantization, where distortion is statistically equivalent to additive noise, and augments the compression distortion with dedicated privacy preserving noise to simultaneously achieve compression and a desired privacy level. We numerically demonstrate that JoPEQ reduces the overall distortion compared to individual LDP and compression, which is translated into improved trained models.
