Abstract
We prove nearly matching upper and lower bounds on the randomized communication complexity of the following problem: Alice and Bob are each given a probability distribution over n elements, and they wish to estimate within ±ε the statistical (total variation) distance between their distributions. For some range of parameters, there is up to a log n factor gap between the upper and lower bounds, and we identify a barrier to using information complexity techniques to improve the lower bound in this case. We also prove a side result that we discovered along the way: the randomized communication complexity of n-bit Majority composed with n-bit Greater Than is Θ(n log n).
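To make the side result concrete, here is a minimal sketch of the composed function itself, not of any communication protocol for it. It assumes the usual reading of the composition: Alice holds n integers of n bits each, Bob holds n such integers, and the output is the majority of the n coordinate-wise Greater Than outcomes. The function names, the strict comparison, and the tie-breaking rule for even n are illustrative assumptions and are not taken from the paper.

```python
def greater_than(a, b):
    """Greater Than on two non-negative integers: 1 if a > b, else 0.

    Assumption: strict comparison; the paper may use a different convention.
    """
    return 1 if a > b else 0


def majority(bits):
    """Majority of a list of 0/1 values.

    Assumption: ties (possible when the length is even) evaluate to 0.
    """
    return 1 if 2 * sum(bits) > len(bits) else 0


def majority_of_greater_than(alice, bob):
    """Majority composed with Greater Than.

    alice and bob are equal-length lists of integers; coordinate i contributes
    the bit greater_than(alice[i], bob[i]) to the outer majority.
    """
    if len(alice) != len(bob):
        raise ValueError("Alice and Bob must hold the same number of integers")
    return majority([greater_than(a, b) for a, b in zip(alice, bob)])


# Example with n = 3: three 3-bit integers per player.
result = majority_of_greater_than([5, 1, 7], [3, 2, 6])
```

In the example, the coordinate-wise comparisons give the bits 1, 0, 1, so the majority is 1. Computing the function is trivial when both inputs are in one place; the Θ(n log n) bound concerns how many bits Alice and Bob must exchange when each sees only their own list.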
Highlights
Statistical (a.k.a. total variation) distance is a standard measure of the distance between two probability distributions, and is ubiquitous in theoretical computer science.
It is natural to inquire about the computational complexity of estimating the statistical distance between two distributions x and y that are given as input.
[25] showed that when each of x and y is succinctly represented by an algorithm that takes uniform random bits and produces a sample from that distribution, the problem of estimating ∆(x, y) is complete for the complexity class SZK. (For results about the complexity of other problems where the inputs are succinctly represented distributions, see [12, 13, 3, 14, 30, 29].)
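For reference, the quantity ∆(x, y) being estimated can be computed directly when both distributions are available in one place. The sketch below uses the standard definition of total variation distance, half the ℓ1 distance between the probability vectors; the function names and the helper that checks the ±ε guarantee are illustrative and not taken from the paper.

```python
from fractions import Fraction


def statistical_distance(x, y):
    """Total variation distance between two distributions over the same n elements.

    x and y are sequences of probabilities, each summing to 1.
    Uses the standard formula: half the sum of absolute differences.
    """
    if len(x) != len(y):
        raise ValueError("distributions must be over the same number of elements")
    return sum(abs(a - b) for a, b in zip(x, y)) / 2


def is_valid_estimate(estimate, x, y, eps):
    """Check whether an estimate is within +/- eps of the true distance."""
    return abs(estimate - statistical_distance(x, y)) <= eps


# Exact rational arithmetic keeps the example free of floating-point noise.
x = [Fraction(1, 2), Fraction(1, 2), Fraction(0)]
y = [Fraction(1, 4), Fraction(1, 4), Fraction(1, 2)]
distance = statistical_distance(x, y)
```

Here the absolute differences are 1/4, 1/4, and 1/2, so the distance is 1/2. In the communication setting, Alice knows only x and Bob knows only y, so the question is how few bits they must exchange to produce an estimate for which a check like `is_valid_estimate` would succeed.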
Summary
Statistical (a.k.a. total variation) distance is a standard measure of the distance between two probability distributions, and is ubiquitous in theoretical computer science. It is natural to inquire about the computational complexity of estimating the statistical distance between two distributions x and y that are given as input. This topic has been studied before in several contexts. [2, 27, 9] studied the complexity of statistical distance estimation when an algorithm is only given black-box access to oracles that produce samples from the distributions specified by x and y. (For results about the complexity of other problems where the inputs are black-box samples from distributions, see the surveys [14, 24, 7].) [10, 11] studied the space complexity of (a generalization of) statistical distance estimation when the vectors x and y are provided as data streams.