Abstract

We generalize algorithms from computational learning theory that are successful under the uniform distribution on the Boolean hypercube {0,1}^n to algorithms successful on permutation invariant distributions. A permutation invariant distribution is a distribution whose probability mass is unchanged under permutations of the instances. While the tools in our generalization mimic those used for the Boolean hypercube, the fact that permutation invariant distributions are not product distributions presents a significant obstacle. Under the uniform distribution, halfspaces can be agnostically learned in polynomial time for constant ε. The main tools used are a theorem of Peres [Per04] bounding the noise sensitivity of a halfspace, a result of [KOS04] showing that this theorem implies Fourier concentration, and a modification, due to Kalai et al. [KKMS08], of the Low-Degree algorithm of Linial, Mansour, and Nisan [LMN93]. These results were extended to arbitrary product distributions in [BOW08]. We prove analogous results for permutation invariant distributions; more generally, we work in the domain of the symmetric group. We define noise sensitivity in this setting and show that it has a nice combinatorial interpretation in terms of Young tableaux. The main technical innovations involve techniques from the representation theory of the symmetric group, especially the combinatorics of Young tableaux. We show that low noise sensitivity implies concentration on "simple" components of the Fourier spectrum, and that this fact allows us to agnostically learn halfspaces under permutation invariant distributions to constant accuracy in roughly the same time as in the uniform-distribution case over the Boolean hypercube.
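To make the hypercube notion concrete: the noise sensitivity NS_δ(f) is the probability that f(x) ≠ f(y) when x is uniform on {0,1}^n and y is obtained from x by flipping each bit independently with probability δ; Peres' theorem bounds this by O(√δ) for any halfspace. The sketch below is a Monte Carlo estimate of this quantity for the majority function (a halfspace); the function names and parameters are illustrative, not from the paper.

```python
import random

def noise_sensitivity(f, n, delta, trials=100_000):
    """Monte Carlo estimate of NS_delta(f): draw x uniform from {0,1}^n,
    flip each bit independently with probability delta, and count how
    often f changes its value. (Illustrative sketch, not the paper's
    symmetric-group definition.)"""
    disagreements = 0
    for _ in range(trials):
        x = [random.randint(0, 1) for _ in range(n)]
        y = [b ^ (random.random() < delta) for b in x]
        disagreements += f(x) != f(y)
    return disagreements / trials

# Majority on an odd number of bits is a halfspace; by Peres' theorem
# its noise sensitivity at noise rate delta is O(sqrt(delta)).
majority = lambda x: int(2 * sum(x) > len(x))
est = noise_sensitivity(majority, n=101, delta=0.01)
```

For δ = 0.01 the estimate comes out small (on the order of √δ = 0.1), in line with the Peres bound; the paper's contribution is an analogue of this phenomenon for distributions on the symmetric group rather than product distributions on the hypercube.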
