We study the effect of the quality and quantity of side information on the recovery of a hidden community of size K = o(n) in a graph of size n. Side information for each node in the graph is modeled by a random vector in which either the vector dimension or the log-likelihood ratio (LLR) of each component with respect to the node labels is independent of n. These two models represent variation in the quality and quantity of side information. Under maximum likelihood detection, we calculate tight necessary and sufficient conditions for exact recovery of the labels. We demonstrate how side information needs to evolve with n, in terms of either its quantity or its quality, to improve the exact recovery threshold. A similar set of results is obtained for weak recovery. Under belief propagation, tight necessary and sufficient conditions for weak recovery are calculated when the LLRs are constant, and sufficient conditions are calculated when the LLRs vary with n. Moreover, we design and analyze a local voting procedure using side information that can achieve exact recovery when applied after belief propagation.