Network Imputation for a Spatial Autoregression Model with Incomplete Data

Zhimeng Sun, Hansheng Wang

doi:10.2139/ssrn.3249837

Abstract

Missing data are common in practice, and various imputation methods have been developed to handle them. However, existing methods are mainly designed for independent data, and the independence assumption disregards the connections among units through social relationships (e.g., friendship, follower-followee relationships). In fact, the observed responses of connected friends should provide valuable information about missing responses. This observation motivates us to conduct imputation by borrowing information from connected friends through the network structure. Under the missing at random assumption and using observed information only, we propose a partial likelihood approach and develop the corresponding maximum partial likelihood estimator (MPLE). The consistency and asymptotic normality of the MPLE are established. Using the MPLE, we then develop a novel regression imputation method that utilizes both auxiliary information and connected complete units (i.e., network information). Using the imputed data, we can compute the sample mean of the responses, and we show that the resulting estimator is consistent and asymptotically normal. Compared with the estimator obtained by imputation using auxiliary information only (i.e., ignoring network information), the proposed estimator is statistically more efficient. Extensive simulation studies are conducted to demonstrate its finite sample performance. We then analyze a real example from QQ in mainland China for illustration.

Full Text