We consider the problem of discrete distribution estimation under $\ell_1$ loss. We provide tight upper and lower bounds on the maximum risk of the empirical distribution (the maximum likelihood estimator), and on the minimax risk, in regimes where the support size $S$ may grow with the number of observations $n$. We show that among distributions with bounded entropy $H$, the asymptotic maximum risk of the empirical distribution is $2H/\ln n$, while the asymptotic minimax risk is $H/\ln n$. Moreover, we show that a hard-thresholding estimator, oblivious to the unknown upper bound $H$, is essentially minimax. However, if we constrain the estimates to lie in the simplex of probability distributions, then the asymptotic minimax risk is again $2H/\ln n$. We draw connections between our work and the literature on density estimation, entropy estimation, total variation distance ($\ell_1$ divergence) estimation, joint distribution estimation in stochastic processes, normal mean estimation, and adaptive estimation.
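As a rough illustration of the two estimators compared above, the following Python sketch estimates the expected $\ell_1$ loss of the empirical distribution and of a simple hard-thresholding variant by Monte Carlo. The threshold $\log(n)/n$ on empirical frequencies, the uniform source, and the helper names (`l1_risk`, `empirical`, `hard_threshold`) are illustrative assumptions for this sketch, not the paper's exact construction or tuning.

```python
import numpy as np

def l1_risk(estimator, p, n, trials=200, seed=0):
    """Monte Carlo estimate of the expected l1 loss E||p_hat - p||_1."""
    rng = np.random.default_rng(seed)
    losses = []
    for _ in range(trials):
        counts = rng.multinomial(n, p)
        losses.append(np.abs(estimator(counts, n) - p).sum())
    return float(np.mean(losses))

def empirical(counts, n):
    # Maximum likelihood estimator: the empirical distribution.
    return counts / n

def hard_threshold(counts, n):
    # Zero out symbols whose empirical frequency falls below a small threshold.
    # The threshold log(n)/n is an assumed, illustrative choice; the estimate is
    # not renormalized, so it may lie outside the probability simplex.
    p_hat = counts / n
    p_hat[p_hat < np.log(n) / n] = 0.0
    return p_hat

if __name__ == "__main__":
    n, S = 10_000, 2_000                 # sample size and support size
    p = np.full(S, 1.0 / S)              # uniform source, entropy H = ln(S)
    print("empirical   l1 risk:", l1_risk(empirical, p, n))
    print("thresholded l1 risk:", l1_risk(hard_threshold, p, n))
```

The printed risks are finite-sample Monte Carlo estimates for one particular source; the $2H/\ln n$ and $H/\ln n$ expressions in the abstract are asymptotic worst-case quantities over the bounded-entropy class.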