Abstract
A method for estimating the Shannon differential entropy of multidimensional random variables using independent samples is described. The method is based on decomposing the distribution into a product of marginal distributions and a joint dependency structure, also known as the copula. The entropy of the marginals is estimated using one-dimensional methods. The entropy of the copula, which always has compact support, is estimated recursively by splitting the data along statistically dependent dimensions. The method can be applied to distributions with either compact or non-compact support, which is essential when the support is not known or is of a mixed type (differing between dimensions). In high dimensions (greater than 20), numerical examples demonstrate that our method is not only more accurate, but also significantly more efficient than existing approaches.
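The decomposition described above can be written out explicitly. The identity below is the standard consequence of Sklar's theorem for a continuous distribution; the symbols p_i, F_i, c and H_c are notation assumed here for illustration and are not taken from the source.

```latex
% Notation assumed here: p_i and F_i are the marginal density and CDF of X_i,
% and c is the copula density on the unit cube [0,1]^D.
p(x_1,\dots,x_D) = c\bigl(F_1(x_1),\dots,F_D(x_D)\bigr)\prod_{i=1}^{D} p_i(x_i)
\quad\Longrightarrow\quad
H(X) = \sum_{i=1}^{D} H(X_i) + H_c,
\qquad
H_c = -\int_{[0,1]^D} c(u)\,\log c(u)\,\mathrm{d}u \;\le\; 0 .
```

The copula term H_c vanishes exactly when the components are independent (c = 1 everywhere), in which case the joint entropy is the sum of the marginal entropies.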
Highlights
Differential entropy (DE) has wide applications in a range of fields including signal processing, machine learning, and feature selection [1,2,3]
If singular value decomposition (SVD) converts the distribution into a product of independent 1D variables, the copula density is close to 1 and the method will be highly accurate after a single iteration
We presented a new algorithm for estimating the differential entropy of high-dimensional distributions using independent samples
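The special case mentioned in the SVD highlight can be illustrated with a minimal sketch. It is not the authors' algorithm: it omits the copula term entirely, so it is only expected to be accurate when the rotated components are (nearly) independent, as for a multivariate Gaussian. The function names `spacing_entropy` and `svd_rotated_entropy` are hypothetical, and the Vasicek m-spacing estimator is one possible 1D method chosen here as an assumption.

```python
import numpy as np


def spacing_entropy(x, m=None):
    """Vasicek m-spacing estimate of 1D differential entropy, in nats."""
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    if m is None:
        m = max(1, int(round(np.sqrt(n))))  # assumption: sqrt(n) window
    idx = np.arange(n)
    # Order statistics m places above and below, clipped at the sample ends.
    upper = x[np.minimum(idx + m, n - 1)]
    lower = x[np.maximum(idx - m, 0)]
    return float(np.mean(np.log(n / (2.0 * m) * (upper - lower))))


def svd_rotated_entropy(samples):
    """Sum of 1D entropies after rotating the data onto its SVD axes.

    A rotation is orthogonal, so it leaves the joint entropy unchanged.
    The copula contribution is ignored (assumed to be zero).
    """
    x = np.asarray(samples, dtype=float)
    centered = x - x.mean(axis=0)
    # Rows of vt are orthonormal principal directions of the sample.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    rotated = centered @ vt.T
    return sum(spacing_entropy(rotated[:, j]) for j in range(rotated.shape[1]))
```

For samples from a correlated Gaussian, the result can be compared with the closed-form entropy 0.5 * (D * log(2 * pi * e) + log det(cov)). For distributions whose rotated components remain dependent, the neglected copula term would have to be estimated separately.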
Summary
Differential entropy (DE) has wide applications in a range of fields including signal processing, machine learning, and feature selection [1,2,3]. Two variables are independent if and only if their mutual information vanishes, and mutual information can be written in terms of entropies, I(X;Y) = H(X) + H(Y) - H(X,Y); accurate and efficient entropy estimation algorithms are therefore highly advantageous [5]. Another important application of DE estimation is quantifying order in out-of-equilibrium physical systems [6,7]. In 1D, the most straightforward method is to partition the support of the distribution into bins and either calculate the entropy of the histogram or use it for plug-in estimates [8,10,11]. This amounts to approximating p(x) as a piecewise-constant function (i.e., assuming that the distribution is uniform in each subset of the partition).
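The piecewise-constant approximation described above can be sketched as a short histogram plug-in estimator. This is a minimal illustration under stated assumptions, not the specific estimator used in the cited works: the function name `histogram_entropy`, the equal-width bins, and the square-root rule for the default bin count are all choices made here.

```python
import numpy as np


def histogram_entropy(samples, n_bins=None):
    """Plug-in estimate of 1D differential entropy (nats) from a histogram."""
    x = np.asarray(samples, dtype=float)
    n = x.size
    if n_bins is None:
        n_bins = max(1, int(np.sqrt(n)))  # assumption: square-root rule
    counts, edges = np.histogram(x, bins=n_bins)
    widths = np.diff(edges)
    nonzero = counts > 0  # empty bins contribute nothing to the sum
    probs = counts[nonzero] / n
    # Inside bin k the density is approximated by the constant probs[k] / widths[k],
    # so the entropy integral reduces to a sum over occupied bins.
    return float(-np.sum(probs * np.log(probs / widths[nonzero])))
```

For a large sample from the uniform distribution on [0, 1] the estimate should be close to 0, and for a standard normal sample it should be close to 0.5 * log(2 * pi * e). The result depends on the bin count, which is the main tuning choice in this kind of estimator.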