Abstract

Detecting and dealing with spatial autocorrelation (SA) are indispensable steps in analyses of geospatial data. Despite consensus that SA originates from both extrinsic factors and intrinsic interactions, previous studies on regression analysis of spatially autocorrelated data have rarely controlled for intrinsic sources in addition to extrinsic ones when assessing ceteris paribus (i.e. causal) effects of interest under the strict exogeneity assumption that errors containing unexplained variance are uncorrelated with explanatory variables for all observations. This assumption becomes invalid when intrinsic SA is not an external process that can be modelled as errors and instead needs to be controlled for. Here, we aimed to assess the extent to which not controlling for intrinsic SA negatively affects model performance, specifically in terms of type I error rate and unbiasedness of coefficient estimates, and to identify models that are able to handle these problems. To this end, we applied two categories of regression models that do (intrinsic category) or do not (extrinsic category) explicitly control for intrinsic SA to artificial data generated with both SA sources. These models included the extended spatial Durbin model (ESDM) and its nested models. Four analytic scenarios simulated modelling conditions with increasing complexity of variables omitted during modelling. The two more complex scenarios involved additional violations of strict exogeneity. We found that intrinsic SA, just like extrinsic SA, can produce incorrect type I error rates if not explicitly controlled for. Failing to control for intrinsic SA also generated bias in estimates of ceteris paribus effects. By contrast, ESDM from the intrinsic category performed consistently well in dealing with intrinsic SA across all scenarios, but suffered from other violations of strict exogeneity.
Overall, model specification should control for both extrinsic and intrinsic processes generating SA in spatial data to provide reliable type I error rates and unbiased estimates of ceteris paribus effects. Given the likely widespread occurrence in observational spatial data of unknown or unmeasurable processes, ESDM should be a generally preferred starting point to explore the optimal model specification for estimating ceteris paribus effects, with due caution to other violations of strict exogeneity.
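
The kind of comparison described above can be illustrated with a minimal simulation sketch. It is not the study's design: the ESDM is replaced here by a plain spatial lag model fitted by concentrated maximum likelihood, and the lattice size, the value of rho, the coefficient value, the number of replications and all function names are assumptions chosen for illustration only. Data are generated with an extrinsic source (a spatially autocorrelated covariate) and an intrinsic source (a spatial lag of the response), and ordinary least squares, which ignores the lag, is compared with the model that includes it.

```python
import numpy as np

def lattice_weights(side):
    """Row-standardised rook-contiguity weights for a side x side grid."""
    n = side * side
    W = np.zeros((n, n))
    for r in range(side):
        for c in range(side):
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < side and 0 <= cc < side:
                    W[r * side + c, rr * side + cc] = 1.0
    return W / W.sum(axis=1, keepdims=True)

def ols_slope(y, x):
    """Slope of y on x (with intercept), ignoring any spatial lag."""
    X = np.column_stack([np.ones_like(x), x])
    return np.linalg.lstsq(X, y, rcond=None)[0][1]

def lag_model_slope(y, x, W, grid, logdets):
    """Slope from y = rho*W*y + b0 + b1*x + e, with rho chosen by a grid
    search over the concentrated Gaussian log-likelihood."""
    n = len(y)
    X = np.column_stack([np.ones_like(x), x])
    Wy = W @ y
    best_loglik, best_slope = -np.inf, np.nan
    for rho, logdet in zip(grid, logdets):
        z = y - rho * Wy                      # spatially filtered response
        coef = np.linalg.lstsq(X, z, rcond=None)[0]
        resid = z - X @ coef
        loglik = logdet - 0.5 * n * np.log(resid @ resid / n)
        if loglik > best_loglik:
            best_loglik, best_slope = loglik, coef[1]
    return best_slope

def simulate(side=15, rho=0.7, beta=1.0, reps=30, seed=0):
    """Return mean slope estimates (OLS, lag model) over replications.
    All default values are illustrative assumptions."""
    rng = np.random.default_rng(seed)
    n = side * side
    W = lattice_weights(side)
    A = np.linalg.inv(np.eye(n) - rho * W)    # spatial multiplier
    grid = np.linspace(-0.9, 0.9, 37)
    # log|I - rho*W| depends only on rho and W, so compute it once
    logdets = [np.linalg.slogdet(np.eye(n) - r * W)[1] for r in grid]
    ols, lag = [], []
    for _ in range(reps):
        x = A @ rng.standard_normal(n)                 # extrinsic SA in x
        y = A @ (beta * x + rng.standard_normal(n))    # intrinsic SA in y
        ols.append(ols_slope(y, x))
        lag.append(lag_model_slope(y, x, W, grid, logdets))
    return float(np.mean(ols)), float(np.mean(lag))

if __name__ == "__main__":
    ols_mean, lag_mean = simulate()
    print(f"true slope 1.0 | OLS mean {ols_mean:.3f} | lag-model mean {lag_mean:.3f}")
```

Under these assumptions, the OLS slope absorbs part of the spatial feedback in the response and so tends to overstate the coefficient, whereas the model with an explicit lag term should stay close to the generating value. The sketch does not examine type I error rates, omitted variables or the other violations of strict exogeneity considered in the study.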
