Abstract

Maximum entropy (MAXENT) is a widely used method for recovering unknown probabilities of a random variable Z. The method uses the first few (empirical) moments of Z and leads to non-parametric estimators. Here we study MAXENT in a Bayesian set-up, assuming that there exists a well-defined Dirichlet density for the unknown probabilities. This allows us to employ the average Kullback-Leibler (KL) distance for evaluating various MAXENT constraints and for comparing MAXENT with parametric estimators: the regularized maximum likelihood (ML) and the Bayesian estimator. The KL distance is singled out among other distances by demanding a weak consistency of MAXENT with respect to shrinkage. We show that MAXENT applies to sparse data, provided that certain types of prior information are available, e.g. that the probabilities of Z are most probably deterministic, or that there exist prior conditional rank correlations between Z and its probabilities. In the latter case MAXENT can outperform the optimally regularized ML and produce results close to the globally optimal Bayes estimator. Predictions of MAXENT are meaningless (i.e. worse than a random guess) if the needed prior information is absent.
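
As a minimal illustration of the setting described above (not taken from the paper), the Python sketch below draws "true" probabilities of Z from a Dirichlet prior, fits a MAXENT distribution constrained only by the first empirical moment, and scores the result with the KL distance. The support {0,...,4}, the flat Dirichlet parameters, and the sample size 30 are illustrative assumptions, not choices made in the paper.

```python
import numpy as np
from scipy.optimize import brentq

# Illustrative setup: Z takes values 0..4; the "true" probabilities are drawn
# from a flat Dirichlet prior and a small (sparse) sample is observed.
rng = np.random.default_rng(0)
z = np.arange(5)

true_p = rng.dirichlet(np.ones(len(z)))       # unknown probabilities of Z
sample = rng.choice(z, size=30, p=true_p)     # sparse data
m1 = sample.mean()                            # first empirical moment

def maxent_from_mean(z, m1):
    """MAXENT solution p_i proportional to exp(lambda * z_i), with lambda chosen
    so that the distribution reproduces the empirical mean m1."""
    def mean_gap(lam):
        w = np.exp(lam * z)
        return (w * z).sum() / w.sum() - m1
    lam = brentq(mean_gap, -50.0, 50.0)       # mean_gap is monotone in lambda
    w = np.exp(lam * z)
    return w / w.sum()

def kl(p, q):
    """Kullback-Leibler distance D(p || q)."""
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

p_me = maxent_from_mean(z, m1)
print("true p      :", np.round(true_p, 3))
print("MAXENT p    :", np.round(p_me, 3))
print("KL(true||ME):", round(kl(true_p, p_me), 4))
```

Averaging such KL distances over many Dirichlet draws is one way to reproduce, in miniature, the kind of comparison between MAXENT and parametric estimators discussed in the abstract.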
