In the analysis of spatially referenced data, interest often focuses not on prediction of the spatially indexed variable itself, but on boundary analysis, that is, the determination of boundaries on the map that separate areas of higher and lower values. Existing boundary analysis methods are sometimes generically referred to as wombling, after a foundational article by Womble (1951). When data are available at point level (e.g., exact latitude and longitude of disease cases), such boundaries are most naturally obtained by locating the points of steepest ascent or descent on the fitted spatial surface (Banerjee, Gelfand, and Sirmans 2003). In this article, we propose related methods for areal data (i.e., data which consist only of sums or averages over geopolitical regions). Such methods are valuable in determining boundaries for data sets that, perhaps due to confidentiality concerns, are available only in ecological (aggregated) format, or are only collected this way (e.g., delivery of health‐care or cost information). After a brief review of existing algorithmic techniques (including that implemented in the commercial software BoundarySeer), we propose a fully model‐based framework for areal wombling, using Bayesian hierarchical models with posterior summaries computed using Markov chain Monte Carlo methods. We explore the suitability of various existing hierarchical and spatial software packages (notably S‐plus and WinBUGS) to the task, and show the approach's superiority over existing nonstochastic alternatives, both in terms of utility and average mean square error behavior. We also illustrate our methods (as well as the solution of advanced modeling issues such as simultaneous inference) using colorectal cancer late detection data collected at the county level in the state of Minnesota.
Read full abstract