Abstract

In this paper we revisit the classical problem of nonparametric regression, but impose local differential privacy constraints. Under such constraints, the raw data (X1,Y1),...,(Xn,Yn), taking values in Rd×R, cannot be directly observed, and all estimators are functions of the randomised output from a suitable privacy mechanism. The statistician is free to choose the form of the privacy mechanism, and here we add Laplace distributed noise to a discretisation of the location of a feature vector Xi and to the value of its response variable Yi. Based on this randomised data, we design a novel estimator of the regression function, which can be viewed as a privatised version of the well-studied partitioning regression estimator. The main result is that the estimator is strongly universally consistent, and we further establish an upper bound on the rate of convergence. Our methods and analysis also give rise to a strongly universally consistent binary classification rule for locally differentially private data.
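To make the mechanism described above concrete, the following minimal Python sketch illustrates one possible reading of the abstract for a one-dimensional feature: the cell-membership indicator of Xi over a fixed grid and the truncated response Yi each receive independent Laplace noise, and the estimator aggregates the released values cell by cell, as in a partitioning regression estimate. The grid `edges`, the privacy parameter `alpha`, the truncation level `tau`, the noise scales, and the guard against a non-positive denominator are all illustrative assumptions, not the paper's exact construction.

```python
import numpy as np

rng = np.random.default_rng(0)


def privatise(x, y, edges, alpha, tau):
    """Release a locally privatised view of one raw pair (x, y); sketch only."""
    # Discretise the location of x: a one-hot indicator of its grid cell.
    m = len(edges) - 1
    j = int(np.searchsorted(edges, x, side="right")) - 1
    indicator = np.zeros(m)
    if 0 <= j < m:
        indicator[j] = 1.0
    # Truncate the response so that the released value has bounded sensitivity.
    y_trunc = float(np.clip(y, -tau, tau))
    # Add independent Laplace noise to the discretised location and to the
    # truncated response; the scales split an alpha-LDP budget in half and are
    # an illustrative calibration, not the one used in the paper.
    z_loc = indicator + rng.laplace(scale=4.0 / alpha, size=m)
    z_resp = y_trunc + rng.laplace(scale=4.0 * tau / alpha)
    return z_loc, z_resp


def estimate(private_sample, x0, edges):
    """Privatised partitioning estimate at x0: ratio of noisy per-cell sums."""
    j = int(np.searchsorted(edges, x0, side="right")) - 1
    num = sum(z_resp * z_loc[j] for z_loc, z_resp in private_sample)
    den = sum(z_loc[j] for z_loc, _ in private_sample)
    return num / den if den > 0 else 0.0


# Toy usage: a synthetic one-dimensional sample, privatised point by point.
edges = np.linspace(0.0, 1.0, 11)   # partition [0, 1] into 10 equal cells
xs = rng.uniform(size=200)
ys = xs ** 2 + rng.normal(scale=0.1, size=200)
private_sample = [privatise(x, y, edges, alpha=2.0, tau=5.0) for x, y in zip(xs, ys)]
print(estimate(private_sample, 0.35, edges))
```

Because the Laplace noise is independent of the data and has mean zero, the noisy numerator and denominator are unbiased for their non-private counterparts, which is the intuition behind consistency of such ratio estimators; the paper's analysis handles the exact mechanism and noise calibration.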

Highlights

  • In recent years there has been a surge of interest in data analysis methodology that is able to achieve strong statistical performance without compromising the privacy and security of individual data holders.

  • The concept of differential privacy [15] was introduced to provide a rigorous notion of how much private information about individuals is contained in published statistics. Statistical treatments of this framework include [36, 23, 2, 6]. While it is a suitable constraint for many problems, procedures that are differentially private often require the presence of a third party, who must be trusted to handle the raw data before statistics are published.

  • The local differential privacy constraint [see, for example, 21, 12, and the references therein] was introduced to provide a setting where analysis must be carried out in such a way that each raw data point is only ever seen by the original data holder.

Summary

Introduction

In recent years there has been a surge of interest in data analysis methodology that is able to achieve strong statistical performance without compromising the privacy and security of individual data holders. While differential privacy is a suitable constraint for many problems, procedures that are differentially private often require the presence of a third party, who must be trusted to handle the raw data before statistics are published. To address this shortcoming, the local differential privacy constraint [see, for example, 21, 12, and the references therein] was introduced to provide a setting where analysis must be carried out in such a way that each raw data point is only ever seen by the original data holder. Since the problem of classification is strictly easier than regression, our methods and analysis also give rise to a strongly universally consistent binary classification rule for locally differentially private data.

Preliminaries
Our regression estimation method and its strong universal consistency
Local differential privacy
Consequences in classification
Proofs and auxiliary results