Near-optimal Sample Complexity Bounds for Robust Learning of Gaussian Mixtures via Compression Schemes

Hassan Ashtiani, Christopher Liaw, Nicholas J A Harvey, Shai Ben-David, Abbas Mehrabian, Yaniv Plan

doi:10.1145/3417994

Abstract

We introduce a novel technique for distribution learning based on a notion of sample compression. Any class of distributions that allows such a compression scheme can be learned with few samples. Moreover, if a class of distributions has such a compression scheme, then so do the classes of products and mixtures of those distributions. As an application of this technique, we prove that $\tilde{\Theta}(kd^2/\varepsilon^2)$ samples are necessary and sufficient for learning a mixture of $k$ Gaussians in $\mathbb{R}^d$, up to error $\varepsilon$ in total variation distance. This improves both the known upper bounds and lower bounds for this problem. For mixtures of axis-aligned Gaussians, we show that $\tilde{O}(kd/\varepsilon^2)$ samples suffice, matching a known lower bound. Moreover, these results hold in an agnostic learning (or robust estimation) setting, in which the target distribution is only approximately a mixture of Gaussians. Our main upper bound is proven by showing that the class of Gaussians in $\mathbb{R}^d$ admits a small compression scheme.
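To give a feel for how the two rates in the abstract scale, the following is a minimal sketch that evaluates only the leading terms $kd^2/\varepsilon^2$ and $kd/\varepsilon^2$. The function names are hypothetical, and the constants and polylogarithmic factors hidden by the tilde notation are deliberately omitted, so the outputs indicate relative growth rather than actual sample counts.

```python
def leading_term_general(k: int, d: int, eps: float) -> float:
    """Leading term k * d^2 / eps^2 for mixtures of k general Gaussians in R^d.

    Assumption: constants and polylog factors from the tilde notation are dropped.
    """
    if k < 1 or d < 1 or eps <= 0:
        raise ValueError("need k >= 1, d >= 1 and eps > 0")
    return k * d**2 / eps**2


def leading_term_axis_aligned(k: int, d: int, eps: float) -> float:
    """Leading term k * d / eps^2 for mixtures of k axis-aligned Gaussians in R^d.

    Assumption: constants and polylog factors from the tilde notation are dropped.
    """
    if k < 1 or d < 1 or eps <= 0:
        raise ValueError("need k >= 1, d >= 1 and eps > 0")
    return k * d / eps**2


if __name__ == "__main__":
    # Illustrative parameter values only; they are not taken from the paper.
    k, eps = 5, 0.1
    for d in (10, 100, 1000):
        general = leading_term_general(k, d, eps)
        aligned = leading_term_axis_aligned(k, d, eps)
        print(f"d={d}: general ~ {general:.3g}, axis-aligned ~ {aligned:.3g}")
```

In this simplified view the two leading terms differ by exactly a factor of $d$, which matches the gap between the general and axis-aligned rates stated above.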

Journal: Journal of the ACM
Publication Date: Oct 6, 2020
Citations: 9

