Edgeworth expansions for network moments

Yuan Zhang, Dong Xia

doi:10.1214/21-aos2125

Abstract

Network method of moments (Ann. Statist. 39 (2011) 2280–2301) is an important tool for nonparametric network inference. However, there has been little investigation on accurate descriptions of the sampling distributions of network moment statistics. In this paper, we present the first higher-order accurate approximation to the sampling CDF of a studentized network moment by Edgeworth expansion. In sharp contrast to classical literature on noiseless U-statistics, we show that the Edgeworth expansion of a network moment statistic as a noisy U-statistic can achieve higher-order accuracy without nonlattice or smoothness assumptions but just requiring weak regularity conditions. Behind this result is our surprising discovery that the two typically-hated factors in network analysis, namely, sparsity and edgewise observational errors, jointly play a blessing role, contributing a crucial self-smoothing effect in the network moment statistic and making it analytically tractable. Our assumptions match the minimum requirements in related literature. For sparse networks, our theory shows that our empirical Edgeworth expansion and a simple normal approximation both achieve the same gradually depreciating Berry–Esseen-type bound as the network becomes sparser. This result also significantly refines the best previous theoretical result. For practitioners, our empirical Edgeworth expansion is highly accurate and computationally efficient. It is also easy to implement and convenient for parallel computing. We demonstrate the clear advantage of our method by several comprehensive simulation studies. As a byproduct, we also provide a finite-sample analysis of the network jackknife. We showcase three applications of our results in network inference. We prove, to our knowledge, the first theoretical guarantee of higher-order accuracy for some network bootstrap schemes, and moreover, the first theoretical guidance for selecting the subsample size for network subsampling. 
We also derive a one-sample test and the Cornish–Fisher confidence interval for a given moment with higher-order accurate controls of type I error and confidence level, respectively.
