Abstract
Let 𝒞 be a class of probability distributions over the discrete domain [n] = {1, …, n}. We show that if 𝒞 satisfies a rather general condition – essentially, that each distribution in 𝒞 can be well-approximated by a variable-width histogram with few bins – then there is a highly efficient (both in terms of running time and sample complexity) algorithm that can learn any mixture of k unknown distributions from 𝒞. We analyze several natural types of distributions over [n], including log-concave, monotone hazard rate, and unimodal distributions, and show that they have the required structural property of being well-approximated by a histogram with few bins. Applying our general algorithm, we obtain near-optimally efficient algorithms for all of these mixture learning problems, as described below.

Log-concave distributions: We learn any mixture of k log-concave distributions over [n] using k · Õ(1/ε^4) samples (independent of n) and running in time Õ(k log(n)/ε^4) bit-operations (note that reading a single sample from [n] takes Θ(log n) bit-operations). For the special case k = 1 we give an efficient algorithm using Õ(1/ε^3) samples; this generalizes the main result of [DDS12b] from the class of Poisson Binomial distributions to the much broader class of all log-concave distributions. Our upper bounds are not far from optimal, since any algorithm for this learning problem requires Ω(k/ε^{5/2}) samples.

Monotone hazard rate (MHR) distributions: We learn any mixture of k MHR distributions over [n] using O(k log(n/ε)/ε^4) samples and running in time Õ(k log(n)/ε^4) bit-operations. Any algorithm for this learning problem must use Ω(k log(n)/ε^3) samples.

Unimodal distributions: We give an algorithm that learns any mixture of k unimodal distributions over [n] using O(k log(n)/ε^4) samples and running in time Õ(k log^2(n)/ε^4) bit-operations. Any algorithm for this problem must use Ω(k log(n)/ε^3) samples.
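To make the structural condition concrete, here is a minimal illustrative sketch (not the paper's algorithm) of what "well-approximated by a variable-width histogram with few bins" means: take a distribution over [n], flatten it over a fixed interval partition (spread each interval's mass uniformly), and measure the total variation distance to the original. The helper names flatten_to_histogram and tv_distance, and the particular geometric choice of bins, are our own assumptions for illustration.

```python
import numpy as np

def flatten_to_histogram(p, left_endpoints):
    """Return the 'flattening' of p: on each interval of the partition,
    spread that interval's total mass uniformly over its points.
    `left_endpoints` lists the (0-indexed) left endpoint of each interval."""
    q = np.empty_like(p, dtype=float)
    bounds = list(left_endpoints) + [len(p)]
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        q[lo:hi] = p[lo:hi].sum() / (hi - lo)
    return q

def tv_distance(p, q):
    """Total variation distance between two distributions on [n]."""
    return 0.5 * np.abs(np.asarray(p) - np.asarray(q)).sum()

# A truncated, renormalized geometric distribution over [n] is log-concave,
# so it should be close to a histogram with few (variable-width) bins.
n = 1000
p = 0.99 ** np.arange(n)
p /= p.sum()

# Geometrically growing bins: few bins overall, narrower where the mass sits.
left_endpoints, b = [0], 1
while b < n:
    left_endpoints.append(b)
    b = max(b + 1, int(1.3 * b))

q = flatten_to_histogram(p, left_endpoints)
print(f"{len(left_endpoints)} bins, TV distance ≈ {tv_distance(p, q):.4f}")
```

In this toy example roughly 30 bins already give a small total variation error; the paper's point is that for classes with this property (log-concave, MHR, unimodal), mixtures can be learned with sample complexity driven by the number of bins rather than by n.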