Abstract
Statistical analysis of networks is an active research area, and the literature contains many papers on network models and their statistical analysis. However, very few papers deal with missing data in network analysis, even though, in practice, networks are often observed with missing values. In this paper we focus on the Stochastic Block Model with valued edges and consider a missing completely at random (MCAR) setting by assuming that every dyad (pair of nodes) is sampled identically and independently of the others with probability $\rho >0$. We prove that the maximum likelihood estimator and its variational approximation are consistent and asymptotically normal in the presence of missing data as soon as the sampling probability $\rho $ satisfies $\rho \gg \log (n)/n$.
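The sampling scheme described in the abstract can be illustrated with a small simulation. The sketch below is not the paper's code: it assumes binary (Bernoulli) edges as one special case of valued edges, an undirected graph without self-loops, and hypothetical names (`simulate_sbm_mcar`, `alpha` for block proportions, `pi` for the block connectivity matrix). Unobserved dyads are marked with -1.

```python
import numpy as np

def simulate_sbm_mcar(n, alpha, pi, rho, seed=0):
    """Simulate an undirected Bernoulli SBM and observe each dyad with probability rho.

    alpha : block proportions (length Q, sums to 1)  -- illustrative name
    pi    : Q x Q symmetric matrix of edge probabilities -- illustrative name
    rho   : MCAR sampling probability, identical for every dyad
    """
    rng = np.random.default_rng(seed)
    alpha = np.asarray(alpha, dtype=float)
    pi = np.asarray(pi, dtype=float)

    # Latent block membership of each node.
    z = rng.choice(len(alpha), size=n, p=alpha)

    # Edge probability for every pair (i, j), given the blocks of i and j.
    probs = pi[z][:, z]

    # Work on the strict upper triangle so each dyad is drawn once.
    upper = np.triu(np.ones((n, n), dtype=bool), k=1)
    n_dyads = int(upper.sum())

    y = np.zeros((n, n), dtype=int)
    y[upper] = rng.random(n_dyads) < probs[upper]
    y = y + y.T

    # MCAR sampling: each dyad is observed independently with probability rho,
    # regardless of the edge value or the block memberships.
    r = np.zeros((n, n), dtype=bool)
    r[upper] = rng.random(n_dyads) < rho
    r = r | r.T

    # -1 marks dyads whose value is missing (the diagonal is never observed).
    y_obs = np.where(r, y, -1)
    return z, y_obs, r

if __name__ == "__main__":
    n, rho = 200, 0.3
    z, y_obs, r = simulate_sbm_mcar(n, [0.5, 0.5], [[0.6, 0.1], [0.1, 0.4]], rho)
    print("observed fraction of dyads:", r.sum() / (n * (n - 1)))
    print("log(n)/n for comparison:", np.log(n) / n)
```

The final comparison only shows the order of magnitude of $\log (n)/n$ for a given $n$; the condition $\rho \gg \log (n)/n$ in the abstract is asymptotic and cannot be checked from a single simulated graph.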
Highlights
For the last decade, statistical network analysis has been a very active research topic, and the statistical modeling of networks has found many applications in the social sciences and biology, for example Aicher et al. (2014), Barbillon et al. (2015), Mariadassou et al. (2010), Wasserman and Faust (1994) and Zachary (1977). Many random graph models have been widely studied, either from a theoretical or an empirical point of view.
In Celisse et al. (2012), consistency of the MLE and VE is proven, but asymptotic normality requires that the estimators converge at rate at least $n^{-1}$, which is not proven in that paper; some results were available for some particular cases.
According to Equation (2.2), if the sampling design is missing completely at random (MCAR), maximising $p_{\theta,\psi}(y^o, z, r)$ or $p_{\theta,\psi}(y^o, r)$ in $\theta$ is equivalent to maximising $p_{\theta}(y^o)$ in $\theta$. This corresponds to the ignorability notion defined in Rubin (1976).
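The ignorability statement can be made explicit with a short derivation. This is a sketch of the standard argument, not a reproduction of Equation (2.2), which is not shown here. It assumes that $y^m$ denotes the unobserved dyads (a notation introduced for this illustration), that $\psi$ parametrises only the sampling design, and that under MCAR the sampling indicator $r$ is independent of the full data and the latent blocks $z$. The integral over $y^m$ is read as a sum when the edge values are discrete.

$$
p_{\theta,\psi}(y^o, r) \;=\; \sum_{z} \int p_{\theta}(y^o, y^m, z)\, p_{\psi}(r)\, dy^m \;=\; p_{\psi}(r)\, p_{\theta}(y^o),
\qquad\text{so}\qquad
\arg\max_{\theta}\, p_{\theta,\psi}(y^o, r) \;=\; \arg\max_{\theta}\, p_{\theta}(y^o).
$$

The factor $p_{\psi}(r)$ does not depend on $\theta$, so under these assumptions it can be dropped when estimating $\theta$.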
Summary
Statistical network analysis has been a very active research topic, and the statistical modeling of networks has found many applications in the social sciences and biology, for example Aicher et al. (2014), Barbillon et al. (2015), Mariadassou et al. (2010), Wasserman and Faust (1994) and Zachary (1977). In Celisse et al. (2012), consistency of the MLE and VE is proven, but asymptotic normality requires that the estimators converge at rate at least $n^{-1}$, which is not proven in that paper; some results were available for some particular cases (affiliation, for example). There is a strong asymmetry between the presence of an edge and its absence: the lack of proof that an edge exists is taken as proof that the edge does not exist, and edges with uncertain status are considered non-existent in the graph. This is the strategy adopted in most sparse asymptotic settings, where the density of edges goes to 0 asymptotically (Bickel et al., 2013). Technical lemmas and details of the proofs are available in the appendices.