Abstract

Generalized linear models (GLMs) are used in high-dimensional machine learning, statistics, communications, and signal processing. In this paper we analyze GLMs when the data matrix is random, as relevant in problems such as compressed sensing, error-correcting codes, or benchmark models in neural networks. We evaluate the mutual information (or "free entropy") from which we deduce the Bayes-optimal estimation and generalization errors. Our analysis applies to the high-dimensional limit where both the number of samples and the dimension are large and their ratio is fixed. Nonrigorous predictions for the optimal errors existed for special cases of GLMs, e.g., for the perceptron, in the field of statistical physics based on the so-called replica method. Our present paper rigorously establishes those decades-old conjectures and brings forward their algorithmic interpretation in terms of performance of the generalized approximate message-passing algorithm. Furthermore, we tightly characterize, for many learning problems, regions of parameters for which this algorithm achieves the optimal performance and locate the associated sharp phase transitions separating learnable and nonlearnable regions. We believe that this random version of GLMs can serve as a challenging benchmark for multipurpose algorithms.

Highlights

  • Generalized linear models (GLMs) are used in high-dimensional machine learning, statistics, communications, and signal processing

  • Nonrigorous predictions for the optimal errors existed for special cases of GLMs, e.g., for the perceptron, in the field of statistical physics based on the so-called replica method

  • As datasets grow larger and more complex, modern data analysis requires solving high-dimensional estimation problems with very many parameters. Developing algorithms for this task and understanding their limitations have become a major challenge in computer science, machine learning, statistics, signal processing, communications, and related fields. We address this challenge in the case of generalized linear estimation models (GLMs) (1, 2), where data are generated as follows: given an n-dimensional vector X∗ hidden from the statistician, one observes instead an m-dimensional vector Y whose components read Y_μ ∼ P_out(· | (ΦX∗)_μ/√n) for μ = 1, …, m, with Φ a known m × n random data matrix and P_out a known output channel (see the data-generation sketch below).
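A minimal sketch of this data-generation model follows. The concrete choices here are illustrations, not fixed by the text above: an i.i.d. standard Gaussian prior P_0 and data matrix Φ, and the perceptron's noiseless sign channel as one instance of P_out (the perceptron being a special case the paper discusses):

```python
# Minimal sketch of the random-GLM data model Y_mu ~ P_out(.|(Phi X*)_mu / sqrt(n)).
# Assumed here (illustrative choices only): Gaussian prior P_0, i.i.d. Gaussian
# data matrix Phi, and the noiseless perceptron channel y = sign(z).
import numpy as np

rng = np.random.default_rng(0)
n, m = 1000, 2000                  # dimension n, samples m; the ratio alpha = m/n is fixed

x_star = rng.standard_normal(n)    # hidden signal/weights X*, drawn from the prior P_0
Phi = rng.standard_normal((m, n))  # known random data matrix
z = Phi @ x_star / np.sqrt(n)      # projections (Phi X*)_mu / sqrt(n), each of order 1
y = np.sign(z)                     # perceptron output; any noisy channel P_out(.|z) fits here
```

The 1/√n scaling keeps each projection of order 1 as n grows, which is what makes the fixed-ratio high-dimensional limit m/n → α well defined.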


Summary

Main Results

For the random GLM problem as defined in the Introduction, the optimal way to estimate the ground-truth signal/weights X∗ relies on its posterior probability distribution

P(x | Y, Φ) = (1/Z_n) ∏_{i=1}^{n} P_0(x_i) ∏_{μ=1}^{m} P_out(Y_μ | (Φx)_μ/√n),

where Z_n is the normalization constant (the partition function), P_0 is the prior on the components of the signal, and P_out(Y_μ | z) is the probability that an output Y_μ is observed given the projection z = (Φx)_μ/√n.
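To make the posterior concrete, the following brute-force sketch (an illustration, not code from the paper) computes it exactly on a tiny instance. Assumed here: a Rademacher prior P_0 = Uniform{−1, +1}, whose contribution is a constant, and a Gaussian channel P_out(y | z) = N(y; z, σ²). Enumerating all 2^n sign configurations yields the Bayes-optimal (MMSE) estimator, i.e., the posterior mean:

```python
# Brute-force evaluation of the GLM posterior for small n (illustrative choices:
# Rademacher prior, Gaussian channel P_out(y|z) = N(y; z, sigma^2)).
import itertools
import numpy as np

rng = np.random.default_rng(0)
n, m, sigma = 8, 16, 0.5

x_star = rng.choice([-1.0, 1.0], size=n)              # ground-truth signal X*
Phi = rng.standard_normal((m, n))                     # random data matrix
y = Phi @ x_star / np.sqrt(n) + sigma * rng.standard_normal(m)

# Log of the unnormalized posterior: the flat Rademacher prior is a constant,
# so only sum_mu log P_out(Y_mu | (Phi x)_mu / sqrt(n)) matters.
configs = np.array(list(itertools.product([-1.0, 1.0], repeat=n)))  # all 2^n signals
z = configs @ Phi.T / np.sqrt(n)                      # (2^n, m) channel arguments
log_w = -0.5 * np.sum((y - z) ** 2, axis=1) / sigma**2
w = np.exp(log_w - log_w.max())
w /= w.sum()                                          # normalized posterior weights

x_mmse = w @ configs                                  # posterior mean = Bayes-optimal estimate
print("MMSE per-component error:", np.mean((x_mmse - x_star) ** 2))
```

This enumeration is exponential in n and only serves to illustrate the formula; the paper's results concern the high-dimensional limit, where the generalized approximate message-passing algorithm achieves the Bayes-optimal error in the regions of parameters characterized in the Main Results.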
