Abstract

We show that the derivative of the relative entropy with respect to its parameters is bounded from below and above. We characterize the conditions under which this derivative can reach zero. We use these results to explain when the minimum relative entropy and the maximum log likelihood approaches can be valid. We show that these approaches naturally activate in the presence of large data sets and that they are inherent properties of any density estimation process involving a large number of random variables.
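To fix notation for what follows (a sketch in our own notation; the parametric model q_\theta below is an assumption for illustration, not a definition taken from the paper), write p for the data-generating density and q_\theta for the model density. The quantity whose parameter derivative is bounded is the relative entropy

\[
  D(p \,\|\, q_\theta) \;=\; \int p(x)\,\log\frac{p(x)}{q_\theta(x)}\,dx,
  \qquad
  \frac{\partial}{\partial\theta}\, D(p \,\|\, q_\theta)
  \;=\; -\int p(x)\,\frac{\partial}{\partial\theta}\,\log q_\theta(x)\,dx,
\]

where the second expression uses the fact that the term \(\int p \log p\) does not depend on \(\theta\) (and assumes that differentiation under the integral sign is justified).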

Highlights

  • Given a large ensemble of i.i.d. random variables, all of them generated according to some common density function, the asymptotic equipartition principle [1] guarantees that only a fraction of the possible ensembles, which is called the typical set, gathers almost all the probability ([2], p. 226); a precise statement of this principle is sketched after these highlights

  • This fact opens interesting possibilities: What if the properties of the typical set impose conditions on any estimation process, such that the parameter search is focused on just a small subset of the available parameter set? We explore this using generalizations of the Shannon differential entropy [1] and the Fisher information [3,4,5] under typical set considerations

  • We cannot use a comparison between the value of the log likelihood term and that of the Shannon differential entropy to assess the goodness of our solution, as we could in the case of the mixed entropy, because calculating the differential entropy is analogous to the very problem we are trying to solve
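For reference, the asymptotic equipartition principle invoked in the first highlight can be stated as follows (a standard formulation, consistent with [1] and [2], written in our own notation). If \(X_1,\dots,X_n\) are i.i.d. with density \(f\) and differential entropy \(h(f)\), then

\[
  -\frac{1}{n}\,\log f(X_1,\dots,X_n) \;\longrightarrow\; h(f) \quad \text{in probability},
\]

so for every \(\varepsilon > 0\) and all sufficiently large \(n\) the typical set

\[
  A_\varepsilon^{(n)} \;=\; \Big\{ (x_1,\dots,x_n) \,:\, \Big|{-\tfrac{1}{n}\log f(x_1,\dots,x_n)} - h(f)\Big| \le \varepsilon \Big\}
\]

satisfies \(P\big(A_\varepsilon^{(n)}\big) > 1-\varepsilon\), while its volume is approximately \(2^{n\,h(f)}\) when logarithms are taken in base 2.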


Summary

Introduction

Given a large ensemble of i.i.d. random variables, all of them generated according to some common density function, the asymptotic equipartition principle [1] guarantees that only a fraction of the possible ensembles, which is called the typical set, gathers almost all the probability ([2], p. 226). We show that the derivative of the relative entropy with respect to its parameters is lower and upper bounded, and, under the conditions that activate these bounds, we characterize when the minimum relative entropy and maximum log likelihood approaches render the same solution. We show that these equivalences become true by the sole fact of having a large number of random variables.
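As a minimal numerical sketch of this equivalence (our own illustration; the unit-variance Gaussian model and every name below are assumptions for the example, not taken from the paper), note that \(D(p \,\|\, q_\theta) = -h(p) - \mathrm{E}_p[\log q_\theta(X)]\), so the parameter enters only through the expected log likelihood. With a large i.i.d. sample, maximizing the empirical log likelihood and minimizing a Monte Carlo estimate of the relative entropy therefore pick the same parameter:

# Sketch (illustrative only): with many i.i.d. samples, the maximizer of the
# average log likelihood coincides with the minimizer of the estimated
# relative entropy D(p || q_mu) = -h(p) - E_p[log q_mu], since h(p) does not
# depend on the parameter mu.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
true_mu = 1.3
x = rng.normal(loc=true_mu, scale=1.0, size=n)    # sample from the "unknown" density p

def log_q(x, mu):
    # log density of the model q_mu: a unit-variance Gaussian with mean mu
    return -0.5 * np.log(2.0 * np.pi) - 0.5 * (x - mu) ** 2

mus = np.linspace(0.0, 2.5, 251)                  # candidate parameter values
avg_loglik = np.array([log_q(x, mu).mean() for mu in mus])

h_p = 0.5 * np.log(2.0 * np.pi * np.e)            # differential entropy of p (known in this toy case)
kl_estimate = -h_p - avg_loglik                   # Monte Carlo estimate of D(p || q_mu)

print("argmax average log likelihood:", mus[np.argmax(avg_loglik)])
print("argmin estimated relative entropy:", mus[np.argmin(kl_estimate)])
# Both print the same value, close to true_mu = 1.3.

Because the differential entropy term is constant in the parameter, the two criteria necessarily agree; the role of the large sample size is to make the empirical average a reliable stand-in for the expectation.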

Known Density Functions Case
Mixed Entropy Derivative: A Proxy for the Relative Entropy Derivative
Mixed Entropy Typical Set
Micro-Differences in Mixed Entropy Typical Sets
Mixed Entropy Derivative Bounds
Compliance with Theorem 2
Discussion
Conclusions