Abstract
Distribution testing deals with what information can be deduced about an unknown distribution over $$\{1,\ldots,n\}$$, where the algorithm is only allowed to obtain a relatively small number of independent samples from the distribution. In the extended conditional sampling model, the algorithm is also allowed to obtain samples from the restriction of the original distribution on subsets of $$\{1,\ldots,n\}$$. In 2015, Canonne, Diakonikolas, Gouleakis and Rubinfeld unified several previous results, and showed that for any property of distributions satisfying a “decomposability” criterion, there exists an algorithm (in the basic model) that can distinguish with high probability distributions satisfying the property from distributions that are far from it in the variation distance. We present here a more efficient yet simpler algorithm for the basic model, as well as very efficient algorithms for the conditional model, which until now was not investigated under the umbrella of decomposable properties. Additionally, we provide an algorithm for the conditional model that handles a much larger class of properties. Our core mechanism is an algorithm for efficiently producing an interval-partition of $$\{1,\ldots,n\}$$ that satisfies a “fine-grain” quality. We show that with such a partition at hand we can avoid the search for the “correct” partition of $$\{1,\ldots,n\}$$.
Highlights
In most computational problems that arise from modeling real-world situations, we are required to analyze large amounts of data to decide if it satisfies a fixed property
There has been a long line of research, especially in statistics, where the underlying object from which we obtain the data is modeled as a probability distribution
We study distribution testing in the standard sampling model, as well as in the conditional model
Summary
In most computational problems that arise from modeling real-world situations, we are required to analyze large amounts of data to decide if it satisfies a fixed property. When the distributions in a property C are L-decomposable, there is an efficient algorithm for testing whether a given distribution belongs to C. To achieve their results, Canonne et al. [8] show that if a distribution μ supported over [n] is L-decomposable, then it is also O(L log n)-decomposable where the intervals are of the form $$[j2^i + 1, (j + 1)2^i]$$. This presents a natural approach of computing the interval partition in a recursive manner, by bisecting an interval if it has a large probability weight and is not close to uniform. For further elaboration of this connection see [11].
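The recursive bisection idea can be illustrated with a minimal Python sketch. This is not the algorithm of Canonne et al. or of this paper: it assumes direct access to the full probability vector `p`, whereas a tester only sees samples and would have to estimate weights and closeness to uniform. The function names and the two thresholds (`weight_threshold`, `uniform_threshold`) are hypothetical, chosen only for illustration. Indices are 0-based, so the interval `(lo, hi)` here corresponds to `[lo + 1, hi + 1]` in the notation of the text.

```python
from typing import List, Sequence, Tuple


def distance_to_uniform(p: Sequence[float], lo: int, hi: int) -> float:
    """Total variation distance between p conditioned on [lo, hi] (inclusive,
    0-based) and the uniform distribution on that interval."""
    weight = sum(p[lo:hi + 1])
    if weight == 0:
        return 0.0  # nothing to condition on; treat as uniform
    size = hi - lo + 1
    return 0.5 * sum(abs(p[k] / weight - 1.0 / size) for k in range(lo, hi + 1))


def bisect_partition(
    p: Sequence[float],
    lo: int,
    hi: int,
    weight_threshold: float,   # hypothetical: what counts as "large weight"
    uniform_threshold: float,  # hypothetical: what counts as "close to uniform"
) -> List[Tuple[int, int]]:
    """Split [lo, hi] at its midpoint while the interval is heavy and far
    from uniform; otherwise keep it as one part of the partition."""
    weight = sum(p[lo:hi + 1])
    is_light = weight <= weight_threshold
    is_flat = distance_to_uniform(p, lo, hi) <= uniform_threshold
    if lo == hi or is_light or is_flat:
        return [(lo, hi)]
    mid = (lo + hi) // 2
    left = bisect_partition(p, lo, mid, weight_threshold, uniform_threshold)
    right = bisect_partition(p, mid + 1, hi, weight_threshold, uniform_threshold)
    return left + right


# Example: most of the mass sits on the first element, so the left side is
# refined down to singletons while the light right side stays coarse.
p = [0.7, 0.1, 0.05, 0.05, 0.025, 0.025, 0.025, 0.025]
print(bisect_partition(p, 0, len(p) - 1, weight_threshold=0.2, uniform_threshold=0.1))
```

When the domain size is a power of two, every interval produced by midpoint bisection is dyadic, which matches the interval form quoted above.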