Abstract

High-dimensional learning is a perennial problem because of the “curse of dimensionality”: learning typically demands more computing resources as well as more training data. In differentially private (DP) settings, the problem is exacerbated by the noise that must be added to each dimension to achieve the required privacy. In this paper, we present a surprisingly simple approach that addresses all of these concerns at once, based on histograms constructed on a low-dimensional random projection (RP) of the data. Our approach exploits RP to take advantage of hidden low-dimensional structure in the data, yielding both computational efficiency and improved error convergence with respect to the sample size, so that less training data suffices for learning. We also propose a variant for efficient DP classification that further exploits the data-oblivious nature of both the histogram construction and the RP-based dimensionality reduction, resulting in efficient management of the privacy budget. We present a detailed and rigorous theoretical analysis of the generalisation of our algorithms in several settings, showing that our approach exploits low-dimensional structure in the data, ameliorates the ill effects of the noise required for privacy, and generalises well under minimal conditions. We also corroborate our findings experimentally and demonstrate that our algorithms achieve competitive classification accuracy in both non-private and private settings.
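To make the idea concrete, the following is a minimal sketch of a histogram classifier built on a random projection, with an optional noisy variant. It is not the paper's algorithm: the abstract does not specify the construction, so every detail here is an assumption made for illustration, including the Gaussian projection matrix, the fixed clipping bound `bound`, the equal-width grid with `bins` cells per axis, the Laplace mechanism, and the helper names `fit_rp_histogram`, `_cell_index` and `predict`. The noisy variant assumes add/remove-one-record neighbouring datasets, under which a single record changes exactly one count by 1.

```python
import numpy as np


def _cell_index(X, R, bins, bound):
    """Map each row of X to a flat grid-cell index in the projected space."""
    # Project, then clip to a fixed box so the grid never depends on the data.
    Z = np.clip(X @ R, -bound, bound)
    idx = np.floor((Z + bound) / (2 * bound) * bins).astype(int)
    idx = np.clip(idx, 0, bins - 1)  # points exactly at +bound go in the last bin
    return np.ravel_multi_index(tuple(idx.T), (bins,) * R.shape[1])


def fit_rp_histogram(X, y, n_classes, k=2, bins=8, bound=4.0, epsilon=None, seed=0):
    """Count class labels per grid cell of a k-dimensional random projection.

    If epsilon is given, Laplace noise of scale 1/epsilon is added to every
    count (illustrative choice; assumes add/remove neighbouring datasets).
    """
    rng = np.random.default_rng(seed)
    # Gaussian projection matrix, drawn without looking at the data.
    R = rng.normal(size=(X.shape[1], k)) / np.sqrt(k)
    cells = _cell_index(X, R, bins, bound)
    counts = np.zeros((bins ** k, n_classes))
    np.add.at(counts, (cells, y), 1.0)  # unbuffered add handles repeated cells
    if epsilon is not None:
        counts += rng.laplace(scale=1.0 / epsilon, size=counts.shape)
    return {"R": R, "counts": counts, "bins": bins, "bound": bound}


def predict(model, X):
    """Predict the class with the largest (possibly noisy) count in each cell."""
    cells = _cell_index(X, model["R"], model["bins"], model["bound"])
    return model["counts"][cells].argmax(axis=1)
```

In this sketch the projection matrix and the grid are chosen independently of the data, so only the cell counts touch the private records. That is one plausible reading of how data-obliviousness could help with the privacy budget. Note that the number of cells grows as `bins ** k`, so the sketch is only practical for small `k`.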
