On the Optimal Boolean Function for Prediction Under Quadratic Loss

Nir Weinberger, Ofer Shayevitz

doi:10.1109/tit.2017.2686437

Abstract

Suppose $Y^{n}$ is obtained by observing a uniform Bernoulli random vector $X^{n}$ through a binary symmetric channel. Courtade and Kumar asked how large the mutual information between $Y^{n}$ and a Boolean function $\mathsf{b}(X^{n})$ could be, and conjectured that the maximum is attained by a dictator function. An equivalent formulation of this conjecture is that a dictator minimizes the prediction cost in a sequential prediction of $Y^{n}$ under logarithmic loss, given $\mathsf{b}(X^{n})$. In this paper, we study the question of minimizing the sequential prediction cost under a different (proper) loss function, namely the quadratic loss. In the noiseless case, we show that majority asymptotically minimizes this prediction cost among all Boolean functions. We further show that for weak noise, majority is better than a dictator, and that for strong noise, a dictator outperforms majority. We conjecture that for quadratic loss, there is no single sequence of Boolean functions that is simultaneously (asymptotically) optimal at all noise levels.
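For small $n$, the comparison between a dictator and majority can be explored by brute force. The sketch below is only illustrative and rests on an assumption not spelled out in the abstract: the sequential prediction cost is taken to be the unnormalized sum over $i$ of the expected conditional variance of $Y_{i}$ given $Y^{i-1}$ and $\mathsf{b}(X^{n})$, which is the expected quadratic loss of the conditional-mean predictor. The paper's exact definition and normalization may differ. The names `quadratic_prediction_cost`, `dictator`, and `majority` are hypothetical helpers, and `alpha` denotes the crossover probability of the binary symmetric channel.

```python
from itertools import product


def dictator(x):
    # Dictator function: outputs the first input bit.
    return x[0]


def majority(x):
    # Majority function (intended for odd n, so there are no ties).
    return int(2 * sum(x) > len(x))


def quadratic_prediction_cost(b, n, alpha):
    """Sum over i of E[Var(Y_i | Y^{i-1}, b(X^n))] (assumed cost definition).

    X^n is uniform on {0,1}^n and Y^n is X^n passed through a binary
    symmetric channel with crossover probability alpha.
    """
    # joint[(bit, y)] = P(b(X^n) = bit, Y^n = y), by full enumeration.
    joint = {}
    for x in product((0, 1), repeat=n):
        bit = b(x)
        for y in product((0, 1), repeat=n):
            d = sum(xi != yi for xi, yi in zip(x, y))  # Hamming distance
            p = 0.5 ** n * alpha ** d * (1 - alpha) ** (n - d)
            joint[(bit, y)] = joint.get((bit, y), 0.0) + p

    cost = 0.0
    for i in range(n):
        # Marginalize to P(bit, y_1, ..., y_{i+1}).
        prefix = {}
        for (bit, y), p in joint.items():
            key = (bit, y[: i + 1])
            prefix[key] = prefix.get(key, 0.0) + p
        for bit in (0, 1):
            for past in product((0, 1), repeat=i):
                p0 = prefix.get((bit, past + (0,)), 0.0)
                p1 = prefix.get((bit, past + (1,)), 0.0)
                total = p0 + p1
                if total > 0:
                    # P(bit, past) * q * (1 - q), where q = p1 / total.
                    cost += p0 * p1 / total
    return cost
```

Under this assumed definition, in the noiseless case (`alpha = 0`) with $n = 3$, the dictator's cost is $1/2$ and the majority's cost is $23/48 \approx 0.479$, so majority is already slightly better at this small blocklength. The enumeration is exponential in $n$ and is suitable only for small examples.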

Full Text