Abstract

Background: Spatial data on cases are available either in point form (e.g. longitude/latitude), or aggregated by an administrative region (e.g. zip code or census tract). Statistical methods for spatial data may accommodate either form of data; however, spatial aggregation can affect their performance. Previous work has studied the effect of spatial aggregation on cluster detection methods. Here we consider geographic health data at different levels of spatial resolution, to study the effect of spatial aggregation on disease mapping performance in locating subregions of increased disease risk.

Methods: We implemented a non-parametric disease distance-based mapping (DBM) method to produce a smooth map from spatially aggregated childhood leukaemia data. We then simulated spatial data under controlled conditions to study the effect of spatial aggregation on its performance. We used an evaluation method based on ROC curves to compare performance of DBM across different geographic scales.

Results: Application of DBM to the leukaemia data illustrates the method as a useful visualization tool. Spatial aggregation produced the expected degradation of disease mapping performance. Characteristics of this degradation, however, varied depending on the interaction between the geographic extent of the higher risk area and the level of aggregation. For example, higher risk areas dispersed across several units did not suffer as greatly from aggregation. The choice of centroids also had an impact on the resulting mapping.

Conclusions: DBM can be implemented for continuous and discrete spatial data, but the resulting mapping can lose accuracy in the discrete setting. Investigation of the simulations suggests a complex relationship between performance loss, geographic extent of spatial disturbances and centroid locations. Aggregation of spatial data destroys information and thus impedes efforts to monitor these data for spatial disturbances. The effect of spatial aggregation on cluster detection, disease mapping, and other useful methods in spatial epidemiology is complex and deserves further study.

Highlights

  • Spatial data on cases are available either in point form, or aggregated by an administrative region

  • Measuring the effect of spatial aggregation means determining how the performance of a statistical method changes when the spatial data are available in discrete rather than continuous form (e.g. Zone Improvement Plan (ZIP) code vs longitude and latitude), or at one discrete level rather than another (e.g. ZIP code vs census tract) [10]

  • The upstate New York leukaemia data consist of 592 cases found across 790 discrete census block groups


Summary

Introduction

Spatial data on cases are available either in point form (e.g. longitude/latitude), or aggregated by an administrative region (e.g. zip code or census tract). Measuring the effect of spatial aggregation means determining how the performance of a statistical method changes when the spatial data are available in discrete rather than continuous form (e.g. ZIP code vs longitude and latitude), or at one discrete level rather than another (e.g. ZIP code vs census tract) [10]. This effect is termed the effect of scale or the effect of discretisation, and is a particular aspect of the Modifiable Areal Unit Problem [10].
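
To make the distinction between point and aggregated data concrete, the following minimal sketch replaces each case location with the centre of the region that contains it and measures how far the locations move. It is an illustration under stated assumptions, not the procedure used in the paper: square grid cells stand in for administrative regions, cell centres stand in for region centroids, the case locations are simulated uniformly on the unit square, and the function names `aggregate_to_centroids` and `mean_displacement` are hypothetical.

```python
import math
import random


def aggregate_to_centroids(points, cell_size):
    """Replace each (x, y) point with the centre of the square grid cell containing it.

    Assumption: square cells of side `cell_size` stand in for administrative regions.
    """
    aggregated = []
    for x, y in points:
        cx = (math.floor(x / cell_size) + 0.5) * cell_size
        cy = (math.floor(y / cell_size) + 0.5) * cell_size
        aggregated.append((cx, cy))
    return aggregated


def mean_displacement(points, aggregated):
    """Average distance between each original point and its aggregated location."""
    total = sum(math.dist(p, a) for p, a in zip(points, aggregated))
    return total / len(points)


# Simulated point-level case locations on the unit square (illustrative only).
rng = random.Random(0)
points = [(rng.random(), rng.random()) for _ in range(1000)]

# Coarser cells mimic a coarser level of aggregation.
displacement_by_cell = {}
for cell_size in (0.05, 0.1, 0.25):
    aggregated = aggregate_to_centroids(points, cell_size)
    displacement_by_cell[cell_size] = mean_displacement(points, aggregated)
    print(f"cell size {cell_size}: mean displacement {displacement_by_cell[cell_size]:.4f}")
```

In this toy setting the mean displacement grows with cell size, which gives a simple picture of the positional information lost as the level of aggregation becomes coarser.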

