Abstract

This article studies a high-dimensional inference problem involving the matrix tensor product of random matrices. This problem generalizes a number of contemporary data science problems including the spiked matrix models used in sparse principal component analysis and covariance estimation and the stochastic block model used in network analysis. The main results are single-letter formulas (i.e., analytical expressions that can be approximated numerically) for the mutual information and the minimum mean-squared error (MMSE) in the Bayes optimal setting where the distributions of all random quantities are known. We provide non-asymptotic bounds and show that our formulas describe exactly the leading order terms in the mutual information and MMSE in the high-dimensional regime where the number of rows n and number of columns d scale with d = O(n^α) for some α < 1/20. On the technical side, this article introduces some new techniques for the analysis of high-dimensional matrix-valued signals. Specific contributions include a novel extension of the adaptive interpolation method that uses order-preserving positive semidefinite interpolation paths, and a variance inequality between the overlap and the free energy that is based on continuous-time I-MMSE relations.
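As a concrete instance of the class of problems the abstract describes, the following sketch simulates a rank-one spiked matrix model, one of the special cases generalized by the matrix tensor product. All names, the signal-to-noise parameter `snr`, and the spectral estimator are illustrative assumptions, not taken from the article; the spectral estimate shown is a simple baseline, not the Bayes-optimal estimator whose MMSE the article characterizes.

```python
import numpy as np

# Illustrative spiked matrix model (assumed setup, not the article's exact notation):
# observe Y = sqrt(snr/n) * x x^T + W, where x is a Rademacher spike and
# W is symmetric Gaussian noise.
rng = np.random.default_rng(0)
n, snr = 200, 2.0

x = rng.choice([-1.0, 1.0], size=n)           # low-dimensional structure (Rademacher spike)
W = rng.standard_normal((n, n))
W = (W + W.T) / np.sqrt(2)                    # symmetrize the noise
Y = np.sqrt(snr / n) * np.outer(x, x) + W     # noisy observation of the rank-one signal

# A simple spectral estimate of the spike direction (baseline, not Bayes-optimal).
eigvals, eigvecs = np.linalg.eigh(Y)
v = eigvecs[:, -1]                            # top eigenvector
overlap = abs(v @ x) / np.sqrt(n)             # normalized overlap with the ground truth
```

Above the spectral threshold (here `snr > 1`), the top eigenvector retains a nontrivial overlap with the planted spike; the article's single-letter formulas characterize the information-theoretically optimal overlap rather than this spectral baseline.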

Highlights

  • Inference problems involving the estimation and factorization of large structured matrices play a central role in the data sciences

  • The last decade has witnessed significant progress on theory and algorithms for a variety of models involving low-rank structure, such as the spiked matrix models used in low-rank covariance estimation [1], [2], sparse principal component analysis (PCA) [3], and clustering [4], [5], as well as related models involving sparse graphical structures, such as the stochastic block model (SBM) for community detection [6], [7]

  • While the emphasis on single-letter formulas is standard in areas such as information theory and statistical physics, it differs from some of the other approaches used in the data sciences, which focus instead on order-optimal bounds or rates of convergence


Summary

INTRODUCTION

Inference problems involving the estimation and factorization of large structured matrices play a central role in the data sciences. While the emphasis on single-letter formulas is standard in areas such as information theory and statistical physics, it differs from some of the other approaches used in the data sciences, which focus instead on order-optimal bounds or rates of convergence. In this context, the contribution of this article is a rigorous analysis of the fundamental limits for a broad class of problems related to the matrix tensor product (or Kronecker product) of large random matrices with low-dimensional structure.
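Since the matrix tensor (Kronecker) product is the central object of the article, a minimal numerical sketch may help fix ideas. The matrices and dimensions below are illustrative assumptions only; the check at the end verifies the standard mixed-product identity (A ⊗ B)(C ⊗ D) = (AC) ⊗ (BD).

```python
import numpy as np

# Minimal sketch of the matrix tensor (Kronecker) product on small
# matrices; the entries and sizes are illustrative, not from the article.
A = np.arange(4.0).reshape(2, 2)
B = np.eye(2)
K = np.kron(A, B)      # 4x4 block matrix with blocks A[i, j] * B

# Mixed-product identity: (A ⊗ B)(C ⊗ D) = (AC) ⊗ (BD).
C = np.ones((2, 2))
D = 2.0 * np.eye(2)
lhs = np.kron(A, B) @ np.kron(C, D)
rhs = np.kron(A @ C, B @ D)
```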

The Matrix Tensor Product and Related Models
Overview of Contributions
Related Work
Notation
STATEMENT OF MAIN RESULTS
Mutual Information
MMSE Matrix
Overlap Concentration
General Bound Based on Relative Entropy Variance
Properties of Mutual Information and MMSE
Alternative Characterizations
Examples
Proof of Theorem 1
Proof of Theorem 5
ADAPTIVE INTERPOLATION
Interpolation
Lower Bound
Upper Bound
VARIANCE INEQUALITY
Variance Inequality From Pointwise I-MMSE
Extension to Matrix-Valued Setting
OVERLAP CONCENTRATION
Orthogonal Decomposition
