Efficient estimation methods for simultaneous autoregressive (SAR) models with missing data in the response variable have been well explored in the literature. A common practice is introducing measurement error into SAR models to separate the noise component from the spatial process. However, prior studies have not considered incorporating measurement error into SAR models with missing data. Maximum likelihood estimation for such models, especially with large datasets, poses significant computational challenges. This paper proposes an efficient likelihood-based estimation method, the marginal maximum likelihood (ML), for estimating SAR models on large datasets with measurement errors and a high percentage of missing data in the response variable. The spatial autoregressive model (SAM) and the spatial error model (SEM), two popular SAR model types, are considered. The missing data mechanism is assumed to follow a missing-at-random (MAR) pattern. We propose a fast method for marginal ML estimation with a computational complexity of $O(n^{3/2})$, where $n$ is the total number of observations. This complexity applies when the spatial weight matrix is constructed based on a local neighbourhood structure. The effectiveness of the proposed methods is demonstrated through simulations and real-world data applications.