Transforming geographic scale: a comparison of combined population and areal weighting to other interpolation methods

Elaine Hallisey, Grete Wilt, Lucy Peipins, Natasha Buchanan Lunsford, Andrew Berens, Eric Tai, Barry Flanagan, Brian Lewis, Shannon Graham

doi:10.1186/s12942-017-0102-z

Abstract

Background: Transforming spatial data from one scale to another is a challenge in geographic analysis. As part of a larger, primary study to determine a possible association between travel barriers to pediatric cancer facilities and adolescent cancer mortality across the United States, we examined methods to estimate mortality within zones at varying distances from these facilities: (1) geographic centroid assignment, (2) population-weighted centroid assignment, (3) simple areal weighting, (4) combined population and areal weighting, and (5) geostatistical areal interpolation. For the primary study, we used county mortality counts from the National Center for Health Statistics (NCHS) and population data by census tract for the United States to estimate zone mortality. In this paper, to evaluate the five mortality estimation methods, we employed address-level mortality data from the state of Georgia in conjunction with census data. Our objective here is to identify the simplest method that returns accurate mortality estimates.

Results: The distribution of Georgia county adolescent cancer mortality counts mirrors the Poisson distribution of the NCHS counts for the U.S. Likewise, zone value patterns, along with the error measures of hierarchy and fit, are similar for the state and the nation. Therefore, Georgia data are suitable for methods testing. The mean absolute value arithmetic differences between the observed counts for Georgia and the five methods were 5.50, 5.00, 4.17, 2.74, and 3.43, respectively. Comparing the methods through paired t-tests of absolute value arithmetic differences showed no statistical difference among the methods. However, we found a strong positive correlation (r = 0.63) between estimated Georgia mortality rates and combined weighting rates at zone level. Most importantly, Bland–Altman plots indicated acceptable agreement between paired arithmetic differences of Georgia rates and combined population and areal weighting rates.

Conclusions: This research contributes to the literature on areal interpolation, demonstrating that combined population and areal weighting, compared to other tested methods, returns the most accurate estimates of mortality in transforming small counts by county to aggregated counts for large, non-standard study zones. This conceptually simple cartographic method should be of interest to public health practitioners and researchers limited to analysis of data for relatively large enumeration units.

Highlights

Transforming spatial data from one scale to another is a challenge in geographic analysis
We developed and tested a conceptually simple technique, combined population and areal weighting, which merges a dasymetric population weighting with areal weighting
Histograms of the distribution of adolescent cancer county mortality counts reveal a pattern in Georgia similar to that of the U.S. (Fig. 6)

Summary

Introduction

Transforming spatial data from one scale to another is a challenge in geographic analysis. Geographic boundaries, such as counties, are often unsuitable in terms of the units needed for meaningful data analysis. This spatial misalignment of data is referred to as the change-of-support problem, which is concerned with inferences about the value of any particular variable at an enumeration unit different from that at which data were collected [2, 3]. An analyst who requires data for a non-standard enumeration unit, say a zone surrounding a U.S. hospital (target zone), must transform data collected at another zone level, such as a group of U.S. census tracts (source zones), to match the boundaries of the zone surrounding the hospital.
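To make the source-to-target transformation concrete, the following is a minimal Python sketch of two of the approaches named in the abstract: simple areal weighting and combined population and areal weighting. It is an illustration under stated assumptions, not the authors' implementation. It assumes that a county count is first split among the county's census tracts in proportion to tract population, and that each tract's share is then allocated to the target zone by the fraction of the tract's area that falls inside the zone. The `Tract` class, the function names, and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Tract:
    """Hypothetical census tract record used only for this sketch."""
    county: str          # identifier of the county containing the tract
    population: float    # tract population (the dasymetric weight)
    frac_in_zone: float  # share of tract area inside the target zone, 0..1


def simple_areal_weighting(county_count, county_frac_in_zone):
    """Allocate a county count to the zone by the county's area share alone."""
    return county_count * county_frac_in_zone


def combined_weighting(county_counts, tracts):
    """Assumed two-step estimate: population weighting, then areal weighting."""
    # Total tract population per county, needed for the population shares.
    county_pop = {}
    for t in tracts:
        county_pop[t.county] = county_pop.get(t.county, 0.0) + t.population

    zone_total = 0.0
    for t in tracts:
        if county_pop[t.county] == 0:
            continue  # no population to weight by; contributes nothing
        # Step 1: disaggregate the county count to the tract by population.
        tract_count = county_counts[t.county] * t.population / county_pop[t.county]
        # Step 2: allocate the tract count to the zone by area overlap.
        zone_total += tract_count * t.frac_in_zone
    return zone_total


# Made-up example: one county with 10 deaths and two equal-area tracts.
tracts = [
    Tract(county="A", population=8000, frac_in_zone=1.0),  # fully inside zone
    Tract(county="A", population=2000, frac_in_zone=0.5),  # half inside zone
]
areal_estimate = simple_areal_weighting(10, 0.75)      # 75% of county area in zone
combined_estimate = combined_weighting({"A": 10}, tracts)
print(areal_estimate, combined_estimate)
```

In this made-up case the two estimates differ because most of the county's population lives in the tract that lies entirely inside the zone, which area alone cannot capture. In practice the area fractions would come from a GIS overlay of tract and zone polygons.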

Methods

Results

Discussion

Conclusion