Abstract
In Bayesian statistics, probability distributions express beliefs. For many problems, however, these beliefs cannot be computed analytically, and approximations are needed. We seek a loss function that quantifies how "embarrassing" it is to communicate a given approximation. We reproduce and discuss an old proof showing that there is only one such ranking under the requirements that (1) the best-ranked approximation is the non-approximated belief and (2) the ranking judges approximations only by their predictions for actual outcomes. The loss function obtained in the derivation is equal to the Kullback-Leibler divergence when normalized. This loss function is frequently used in the literature. However, there seems to be confusion about the correct order in which its functional arguments, the approximated and non-approximated beliefs, should be used. The correct order ensures that the recipient of a communication is deprived of only the minimal amount of information. We hope that this elementary derivation settles the apparent confusion. For example, when approximating beliefs with Gaussian distributions, the optimal approximation is given by moment matching. This stands in contrast to many suggested computational schemes.
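As a numerical illustration of the argument-order question, the sketch below compares two Gaussian approximations of a bimodal belief under the loss KL(p || q), where p is the non-approximated belief and q the communicated approximation. The mixture used for p, the integration grid, and the mode-seeking alternative are assumptions chosen purely for illustration; they are not taken from the paper.

```python
import numpy as np

# Grid for numerical integration over the quantity s (assumed range and resolution).
s = np.linspace(-10.0, 10.0, 20001)
ds = s[1] - s[0]

def gauss(s, mu, sigma):
    """Gaussian density with mean mu and standard deviation sigma."""
    return np.exp(-0.5 * ((s - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

# Non-approximated belief p(s): a bimodal mixture (hypothetical example).
p = 0.5 * gauss(s, -2.0, 1.0) + 0.5 * gauss(s, 3.0, 1.0)

def kl(p, q):
    """KL(p || q): expected log-loss of communicating q when p is believed."""
    mask = p > 0
    return np.sum(p[mask] * np.log(p[mask] / q[mask])) * ds

# Moment-matched Gaussian approximation: same mean and variance as p.
mean_p = np.sum(s * p) * ds
var_p = np.sum((s - mean_p) ** 2 * p) * ds
q_moment = gauss(s, mean_p, np.sqrt(var_p))

# A mode-seeking alternative that hugs only one mode of p.
q_mode = gauss(s, 3.0, 1.0)

print("KL(p || q_moment) =", kl(p, q_moment))  # smaller loss
print("KL(p || q_mode)   =", kl(p, q_mode))    # much larger: heavily penalized where p has mass that q neglects
```

Under this ordering the moment-matched Gaussian incurs the smaller loss, consistent with the abstract's claim; reversing the arguments, i.e. minimizing KL(q || p), would instead favor a mode-seeking approximation, which is precisely the confusion the derivation is meant to resolve.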
Highlights
In Bayesian statistics, probabilities are interpreted as degrees of belief
The loss function that is obtained in the derivation is equal to the Kullback-Leibler divergence when normalized
This loss function is frequently used in the literature
Summary
In Bayesian statistics, probabilities are interpreted as degrees of belief. For any set of mutually exclusive and exhaustive events, one expresses the state of knowledge as a probability distribution over that set. The probability of an event describes one's personal confidence that the event will happen or has happened. If the set of possible mutually exclusive and exhaustive events is infinite, it is generally impossible to store all entries of the corresponding probability distribution on a computer or to communicate it through a channel of finite bandwidth. One therefore needs to approximate the probability distribution that describes one's belief. Given a limited set X of approximate beliefs q(s) about a quantity s, which q(s) best approximates the actual belief expressed by the probability distribution p(s)?
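A minimal sketch of that selection problem, assuming a finite event set and a hand-picked candidate set X (all numbers are hypothetical): the best approximation is the q in X that minimizes KL(p || q), i.e. the candidate that deprives the recipient of the least information.

```python
import numpy as np

def kl(p, q):
    """KL(p || q) for discrete distributions over the same event set."""
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

# Actual belief p(s) over four mutually exclusive, exhaustive events (hypothetical numbers).
p = np.array([0.40, 0.30, 0.20, 0.10])

# Limited set X of communicable approximations q(s), e.g. coarse distributions
# that a finite channel can transmit (values are assumptions for illustration).
X = {
    "uniform":   np.array([0.25, 0.25, 0.25, 0.25]),
    "two-level": np.array([0.35, 0.35, 0.15, 0.15]),
    "peaked":    np.array([0.70, 0.10, 0.10, 0.10]),
}

# Rank the candidates by KL(p || q) and pick the minimizer.
best = min(X, key=lambda name: kl(p, X[name]))
for name, q in X.items():
    print(f"KL(p || {name}) = {kl(p, q):.4f}")
print("best approximation:", best)
```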