Abstract
The population recovery problem is a basic problem in noisy unsupervised learning that has attracted significant research attention in recent years [WY12, DRWY12, MS13, BIMP13, LZ15, DST16]. A number of different variants of this problem have been studied, often under assumptions on the unknown distribution (such as that it has restricted support size). In this work we study the sample complexity and algorithmic complexity of the most general version of the problem, under both the bit-flip noise and erasure noise models. We give essentially matching upper and lower sample complexity bounds for both noise models, and efficient algorithms matching these sample complexity bounds up to polynomial factors.
Highlights
For the sake of a compact representation, we assume the learner only lists the nonzero values of D; this means that a successful learner need only list O(1/ε) nonzero values.
For the bit-flip noise population recovery problem, our main result is a lower bound on the sample complexity of estimation, as well as a full noisy population recovery (NPR) algorithm whose running time matches it up to polynomial factors.
An earlier paper by Moitra and Saks [11] gave an algorithm with sample complexity and running time (n/ε)^{O(log(1/ν)/ν)}.
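To illustrate the compact-representation highlight above, here is a minimal sketch of why listing O(1/ε) values can suffice. It assumes a distribution stored as a Python dict from bit strings to probabilities; the names `heavy_part` and `linf_distance` are hypothetical and chosen only for this example. The argument it checks is a simple counting one (at most 1/ε strings can have probability at least ε) and is not necessarily the one the authors use.

```python
from typing import Dict


def linf_distance(p: Dict[str, float], q: Dict[str, float]) -> float:
    """Largest pointwise gap between two sparsely stored distributions.

    Strings missing from a dict are treated as having probability 0.
    """
    keys = set(p) | set(q)
    return max((abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys), default=0.0)


def heavy_part(dist: Dict[str, float], eps: float) -> Dict[str, float]:
    """Keep only the strings whose probability is at least eps.

    Since probabilities sum to 1, at most 1/eps strings can survive,
    and every dropped string had probability below eps.
    """
    return {x: p for x, p in dist.items() if p >= eps}


if __name__ == "__main__":
    # Toy distribution on {0, 1}^3 (illustrative values only).
    D = {"000": 0.5, "101": 0.3, "111": 0.15, "010": 0.05}
    eps = 0.1
    H = heavy_part(D, eps)
    print(len(H), linf_distance(D, H))
```

In this toy case the sparse hypothesis keeps three strings and differs from D only on the dropped string, whose mass is below ε.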
Summary
The noisy population recovery (NPR) problem is to learn an unknown probability distribution D on {0, 1}^n, under ν-noise, to ℓ∞-accuracy ε. In this problem the learner gets access to independent samples y, each a noisy copy of a string x drawn from D. Under bit-flip noise, each coordinate of x is retained with probability ν (as in erasure noise), and is otherwise replaced with a uniformly random bit. This is the model of noise associated with the so-called “Bonami noise operator” T_ν (see [13] for the precise description and many applications of this operator).
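The noise process described above can be simulated in a few lines. The following is a minimal sketch, assuming the distribution is given as a dict from bit strings to probabilities; the function names (`sample_from`, `bit_flip_noise`, `noisy_sample`) are hypothetical and not taken from the paper.

```python
import random
from typing import Dict


def sample_from(dist: Dict[str, float], rng: random.Random) -> str:
    """Draw one string x from the distribution."""
    strings = list(dist)
    weights = [dist[s] for s in strings]
    return rng.choices(strings, weights=weights, k=1)[0]


def bit_flip_noise(x: str, nu: float, rng: random.Random) -> str:
    """Retain each coordinate with probability nu; otherwise
    replace it with a uniformly random bit."""
    out = []
    for bit in x:
        if rng.random() < nu:
            out.append(bit)               # coordinate retained
        else:
            out.append(rng.choice("01"))  # coordinate rerandomized
    return "".join(out)


def noisy_sample(dist: Dict[str, float], nu: float, rng: random.Random) -> str:
    """One sample y as seen by the learner: a noisy copy of x drawn from dist."""
    return bit_flip_noise(sample_from(dist, rng), nu, rng)


if __name__ == "__main__":
    rng = random.Random(0)
    D = {"000": 0.5, "101": 0.3, "111": 0.2}  # illustrative values only
    print([noisy_sample(D, 0.5, rng) for _ in range(5)])
```

Because a rerandomized coordinate agrees with the original bit half the time, each coordinate of y differs from the corresponding coordinate of x with probability (1 − ν)/2 under this process.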