Abstract

A common statistical situation concerns inferring an unknown distribution Q(x) from a known distribution P(y), where X (dimension n) and Y (dimension m) have a known functional relationship. Most commonly, n ≤ m, and the task is relatively straightforward for well-defined functional relationships. For example, if Y1 and Y2 are independent random variables, each uniform on [0, 1], one can determine the distribution of X = Y1 + Y2; here m = 2 and n = 1. However, biological and physical situations can arise where n > m and the functional relation Y→X is non-unique. In general, in the absence of additional information, there is no unique solution for Q in those cases. Nevertheless, one may still want to draw some inferences about Q. To this end, we propose a novel maximum entropy (MaxEnt) approach that estimates Q(x) based only on the available data, namely, P(y). The method has the additional advantage that one does not need to explicitly calculate the Lagrange multipliers. In this paper we develop the approach for both discrete and continuous probability distributions and demonstrate its validity. We also give an intuitive justification and illustrate the approach with examples.
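The straightforward case mentioned in the abstract (n ≤ m) can be checked numerically. The sketch below is an illustration, not part of the paper: it draws samples of X = Y1 + Y2 with Y1, Y2 independent and uniform on [0, 1], and compares a histogram estimate with the standard triangular density 1 − |x − 1| on [0, 2]. The function names, the sample size, the bin count, and the fixed seed are all choices made here for illustration.

```python
import random


def triangular_pdf(x):
    """Exact density of X = Y1 + Y2 for independent Y1, Y2 ~ Uniform[0, 1]."""
    if 0.0 <= x <= 2.0:
        return 1.0 - abs(x - 1.0)
    return 0.0


def histogram_density(samples, n_bins, lo, hi):
    """Histogram estimate of a density on [lo, hi] using equal-width bins."""
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for s in samples:
        # Clamp so that a sample equal to `hi` falls into the last bin.
        k = min(int((s - lo) / width), n_bins - 1)
        counts[k] += 1
    n = len(samples)
    return [c / (n * width) for c in counts]


# Fixed seed (an arbitrary choice) so the run is reproducible.
rng = random.Random(0)
samples = [rng.random() + rng.random() for _ in range(200_000)]

n_bins = 20
density = histogram_density(samples, n_bins, lo=0.0, hi=2.0)
centers = [(k + 0.5) * (2.0 / n_bins) for k in range(n_bins)]

# Largest gap between the empirical and the exact density over all bins.
max_abs_err = max(abs(d - triangular_pdf(c)) for d, c in zip(density, centers))
print(f"max |histogram - exact| = {max_abs_err:.4f}")
```

Here the map from (Y1, Y2) to X is single-valued, so P(y) determines Q(x) completely. The harder case treated in the paper is the reverse one, where many values of x are compatible with the same y.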

Highlights

  • We are often interested in quantitative details about quantities that are difficult or even impossible to measure directly

  • We show that when the variables are discrete, no unique solution exists for Q(x), as the system is underdetermined

  • The results obtained by maximizing S can also be derived by minimizing a relative entropy (RE) between the discrete distribution Q and a uniform distribution U, where the integral in Equation (9) is replaced by a summation over the states in the x space


Summary

Introduction

We are often interested in quantitative details about quantities that are difficult or even impossible to measure directly. Knowing the quantitative values of the parameters representing microbial interactions is of great interest, both because of their role in the development of therapeutic strategies against diseases such as colitis and for basic understanding, as we have discussed in [3]. Inference of these unknown variables from the available data is the subject of a vast literature in diverse disciplines, including statistics, information theory, and machine learning [4,5,6,7]. The challenge is to estimate the distribution of microbial interaction parameters using the distribution of microbial abundances. These inference problems can be dealt with by Maximum Entropy (MaxEnt)-based methods that maximize an entropy function subject to constraints provided by the expectation values calculated from measured data [4,5,7,8].
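A toy discrete example may help make the underdetermined setting concrete. The sketch below is not the paper's derivation. It assumes the only constraint on Q is that it reproduces P(y) when summed over all states x that map to the same y, and it uses the standard fact that Shannon entropy under such block-sum constraints is maximized by spreading each P(y) uniformly over the preimage of y. The map `f`, the state space, the values in `p_y`, and the function names are hypothetical choices for illustration.

```python
import math
from collections import defaultdict
from itertools import product


def maxent_from_marginal(states, f, p_y):
    """MaxEnt Q(x) subject to sum of Q(x) over {x : f(x) = y} equal to P(y).

    Assumes every y with P(y) > 0 has at least one preimage in `states`.
    """
    preimage = defaultdict(list)
    for x in states:
        preimage[f(x)].append(x)
    q = {}
    for y, xs in preimage.items():
        for x in xs:
            # Each P(y) is split equally among the states that produce y.
            q[x] = p_y.get(y, 0.0) / len(xs)
    return q


def entropy(q):
    """Shannon entropy (in nats) of a distribution given as a dict."""
    return -sum(p * math.log(p) for p in q.values() if p > 0.0)


def pushforward(q, f):
    """Distribution of y = f(x) induced by Q(x)."""
    p = defaultdict(float)
    for x, prob in q.items():
        p[f(x)] += prob
    return dict(p)


# Hypothetical setup: x = (x1, x2) with binary entries (n = 2), y = x1 + x2 (m = 1).
states = list(product([0, 1], repeat=2))
f = lambda x: x[0] + x[1]
p_y = {0: 0.2, 1: 0.5, 2: 0.3}  # illustrative "measured" distribution

q_maxent = maxent_from_marginal(states, f, p_y)

# A second Q that also reproduces P(y), showing the solution is not unique.
q_alt = dict(q_maxent)
q_alt[(0, 1)] = 0.4
q_alt[(1, 0)] = 0.1

print(q_maxent)
print(entropy(q_maxent) > entropy(q_alt))
```

Both `q_maxent` and `q_alt` reproduce the same P(y), which is the non-uniqueness described above; the MaxEnt choice is the one with the larger entropy. In this simple setting no Lagrange multipliers need to be solved for numerically, because the uniform split within each preimage can be written down directly.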

The Problem
Results for Continuous Variables
Discussion
Details of the MaxEnt Calculations for Example 1