This paper is concerned with jointly recovering $n$ node variables $\{x_{i}\}_{1\leq i\leq n}$ from a collection of pairwise difference measurements. Imagine we acquire a few observations of the form $x_{i}-x_{j}$; the observation pattern is represented by a measurement graph $\mathcal{G}$ with an edge set $\mathcal{E}$, such that $x_{i}-x_{j}$ is observed if and only if $(i,j)\in\mathcal{E}$. To account for noisy measurements in a general manner, we model the data acquisition process by a set of channels with given input/output transition measures. Employing information-theoretic tools developed for channel decoding problems, we develop a unified framework that characterizes the fundamental recovery criterion and accommodates general graph structures, alphabet sizes, and channel transition measures. In particular, our results isolate a family of minimum channel divergence measures that quantify the degree of measurement corruption, which, together with the size of the minimum cut of $\mathcal{G}$, dictates the feasibility of exact information recovery. For various homogeneous graphs, the recovery condition depends almost exclusively on the edge sparsity of the measurement graph, irrespective of other graphical metrics; equivalently, the minimum sample complexity required for these graphs scales as $({n\log n})/{\mathsf{Hel}_{1/2}^{\min}}$ for a certain information metric $\mathsf{Hel}_{1/2}^{\min}$ defined in the main text, as long as the alphabet size is not super-polynomial in $n$. We apply our general theory to three concrete applications: the stochastic block model, the random corruption model, and the haplotype assembly problem. Our theory leads to orderwise tight recovery conditions in all three scenarios.
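To make the information metric concrete, the following is a minimal sketch that assumes $\mathsf{Hel}_{1/2}$ coincides with the standard squared Hellinger divergence between two distributions on a common alphabet $\mathcal{Y}$, and that $\mathsf{Hel}_{1/2}^{\min}$ minimizes it over the channel output distributions $\{P_{l}\}$ induced by distinct inputs; the paper's exact definition, including the precise minimization set, is given in the main text:
\[
\mathsf{Hel}_{1/2}(P \,\|\, Q) = \frac{1}{2}\sum_{y\in\mathcal{Y}} \bigl(\sqrt{P(y)}-\sqrt{Q(y)}\bigr)^{2}, \qquad \mathsf{Hel}_{1/2}^{\min} = \min_{l\neq l'} \mathsf{Hel}_{1/2}\bigl(P_{l} \,\|\, P_{l'}\bigr).
\]
Under this assumed form, for instance, a binary channel that reports the true difference with probability $1-\epsilon$ and flips it with probability $\epsilon$ yields $\mathsf{Hel}_{1/2}^{\min} = \bigl(\sqrt{1-\epsilon}-\sqrt{\epsilon}\bigr)^{2} = 1-2\sqrt{\epsilon(1-\epsilon)}$, so the $({n\log n})/{\mathsf{Hel}_{1/2}^{\min}}$ sample complexity grows without bound as the corruption rate $\epsilon$ approaches $1/2$.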