In economics and social science, network data are regularly observed, and a thorough understanding of the network community structure facilitates the comprehension of economic patterns and activities. Consider an undirected network with n nodes and K communities. We model the network using the Degree-Corrected Mixed-Membership (DCMM) model, in which each node i=1,2,…,n has a membership vector πi=(πi(1),πi(2),…,πi(K))′, where πi(k) is the weight that node i puts on community k, 1≤k≤K. Compared with the well-known stochastic block model (SBM), the DCMM permits both severe degree heterogeneity and mixed memberships, making it more realistic and general. We present an efficient approach, Mixed-SCORE, for estimating the mixed membership vectors of all nodes and the other DCMM parameters. This approach is inspired by the discovery of a delicate simplex structure in the spectral domain. We derive explicit error rates for the Mixed-SCORE algorithm and demonstrate that it is rate-optimal over a broad parameter space. Our findings provide a novel statistical tool for network community analysis, which can be used to understand network formations, extract nodal features, identify unobserved covariates in dyadic regressions, and estimate peer effects. We apply Mixed-SCORE to a political blog network, two trade networks, a co-authorship network, and a citee network, and obtain interpretable results.
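To make the model concrete, here is a minimal Python sketch of how a network could be drawn from a DCMM-style model. It is an illustration only, not the Mixed-SCORE estimator. The abstract does not state the edge probability, so the sketch assumes the form in which the DCMM is commonly written: nodes i and j are linked with probability θi·θj·πi′Pπj. The degree parameters `theta`, the K×K community matrix `P`, the function name `sample_dcmm`, and the toy values are all assumptions introduced for this example.

```python
import random


def sample_dcmm(theta, memberships, P, seed=0):
    """Draw a symmetric 0/1 adjacency matrix from a DCMM-style model.

    theta:       n positive degree parameters (assumed name, not from the abstract)
    memberships: n membership vectors pi_i, each of length K with weights summing to 1
    P:           K x K symmetric community matrix (assumed name, not from the abstract)
    """
    rng = random.Random(seed)
    n, K = len(theta), len(P)
    A = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # pi_i' P pi_j: community affinity weighted by both membership vectors
            affinity = sum(
                memberships[i][k] * P[k][l] * memberships[j][l]
                for k in range(K)
                for l in range(K)
            )
            # Degree correction scales the affinity; clip so it stays a probability
            prob = min(1.0, theta[i] * theta[j] * affinity)
            if rng.random() < prob:
                A[i][j] = A[j][i] = 1  # undirected network, no self-loops
    return A


# Toy example: 4 nodes, K = 2 communities, two pure nodes and two mixed nodes
theta = [0.9, 0.5, 0.8, 0.4]
memberships = [[1.0, 0.0], [0.7, 0.3], [0.2, 0.8], [0.0, 1.0]]
P = [[1.0, 0.2], [0.2, 1.0]]
A = sample_dcmm(theta, memberships, P, seed=1)
```

Setting every θi to the same value and every πi to a single-community indicator reduces this sketch to an SBM-style sampler, which is the sense in which the DCMM generalizes the SBM.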