Abstract

Background: To reduce the number of non-geocoded cases, researchers and organizations sometimes include cases geocoded to postal code centroids along with cases geocoded with the greater precision of a full street address. Some analysts then use the postal code to assign information to the cases from finer-level geographies such as a census tract. Assignment is commonly completed using either a postal centroid or a geographical imputation method, which assigns a location by using both the demographic characteristics of the case and the population characteristics of the postal delivery area. To date, no systematic evaluation of geographical imputation methods ("geo-imputation") has been completed. The objective of this study was to determine the accuracy of census tract assignment using geo-imputation.

Methods: Using a large dataset of breast, prostate and colorectal cancer cases reported to the New Jersey Cancer Registry, we determined how often cases were assigned to the correct census tract using alternate strategies of demographic-based geo-imputation, and using assignments obtained from postal code centroids. Assignment accuracy was measured by comparing the tract assigned with the tract originally identified from the full street address.

Results: Assigning cases to census tracts using the race/ethnicity population distribution within a postal code resulted in more correctly assigned cases than when using postal code centroids. The addition of age characteristics increased the match rates even further. Match rates were highly dependent on both the geographic distribution of race/ethnicity groups and population density.

Conclusion: Geo-imputation appears to offer some advantages and no serious drawbacks as compared with the alternative of assigning cases to census tracts based on postal code centroids. For a specific analysis, researchers will still need to consider the potential impact of geocoding quality on their results and evaluate the possibility that it might introduce geographical bias.
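
As an illustration of the general idea behind demographic-based geo-imputation, the following Python sketch assigns a case to a census tract by drawing at random from the tracts in its postal code, weighted by how many people of the case's race/ethnicity group live in each tract. This is only a minimal sketch under stated assumptions: the weighted random draw, the fallback to total population, the function names (`tract_weights`, `impute_tract`), and the population counts are all hypothetical and are not taken from the study.

```python
import random

# Hypothetical counts: postal code -> tract -> race/ethnicity group -> persons.
# These numbers are invented for illustration only.
POPULATION = {
    "postal_X": {
        "tract_A": {"white": 900, "black": 100},
        "tract_B": {"white": 300, "black": 700},
    },
}


def tract_weights(postal_code, group, population=POPULATION):
    """Return {tract: probability} for one demographic group in one postal code."""
    tracts = population[postal_code]
    counts = {tract: groups.get(group, 0) for tract, groups in tracts.items()}
    total = sum(counts.values())
    if total == 0:
        # Assumed fallback: if the group is absent from every tract,
        # weight by each tract's total population instead.
        counts = {tract: sum(groups.values()) for tract, groups in tracts.items()}
        total = sum(counts.values())
    return {tract: count / total for tract, count in counts.items()}


def impute_tract(postal_code, group, rng, population=POPULATION):
    """Draw one tract at random, weighted by the group's share in each tract."""
    weights = tract_weights(postal_code, group, population)
    tracts = list(weights)
    return rng.choices(tracts, weights=[weights[t] for t in tracts], k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the draw is repeatable
    print(tract_weights("postal_X", "black"))
    print(impute_tract("postal_X", "black", rng))
```

Adding age to the imputation would, under the same assumptions, amount to keying the counts by a (race/ethnicity, age group) pair instead of race/ethnicity alone.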

Highlights

  • To reduce the number of non-geocoded cases, researchers and organizations sometimes include cases geocoded to postal code centroids along with cases geocoded with the greater precision of a full street address

  • Recent studies have shown that addresses that cannot be geocoded to a full street address are not randomly distributed: they are more likely to be located in rural areas because of the disproportionate number of rural route addresses, post office box addresses, unofficial street names and streets missing from geocoding reference files [5,10,11,12,13,14,15,16]

  • For the entire study population, cases assigned to census tracts based on geographic centroids had correct matches 20.7% of the time, compared with 25.9% when using population-weighted centroids and 27.7% when geo-imputation was based on a distribution of population by race/ethnicity/age (Table 2)
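
The match rates quoted above are the share of cases whose assigned tract equals the tract identified from the full street address. A minimal Python sketch of that comparison follows; the function name `match_rate` and the tract labels are hypothetical, and the example data are invented rather than drawn from the study.

```python
def match_rate(assigned, reference):
    """Fraction of cases whose assigned tract equals the reference tract.

    `assigned` holds tracts from a centroid or geo-imputation method;
    `reference` holds tracts from full street-address geocoding.
    """
    if len(assigned) != len(reference):
        raise ValueError("assigned and reference must have the same length")
    if not reference:
        raise ValueError("at least one case is required")
    matches = sum(a == r for a, r in zip(assigned, reference))
    return matches / len(reference)


if __name__ == "__main__":
    # Invented example: four cases, two assigned to the correct tract.
    assigned = ["tract_A", "tract_B", "tract_C", "tract_A"]
    reference = ["tract_A", "tract_B", "tract_A", "tract_B"]
    print(f"{match_rate(assigned, reference):.1%}")
```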


Summary

Introduction

The process of assigning geographic information based on a street address, commonly referred to as geocoding, is increasingly being used in health research to assess geographic clustering or associations between health outcomes and area-based socioeconomic and/or environmental characteristics [1,2,3,4]. Researchers can either exclude cases that cannot be geocoded to a full street address, or include them and assign locations with a lower level of spatial precision. Rather than ignore this problem as has been done in the …

