Positional Accuracy of Spatial Data: Non‐Normal Distributions and a Critique of the National Standard for Spatial Data Accuracy

Paul A Zandbergen

doi:10.1111/j.1467-9671.2008.01088.x

Abstract

Spatial data quality is a paramount concern in all GIS applications. Existing spatial data accuracy standards, including the National Standard for Spatial Data Accuracy (NSSDA) used in the United States, commonly assume the positional error of spatial data is normally distributed. This research has characterized the distribution of the positional error in four types of spatial data: GPS locations, street geocoding, TIGER roads, and LIDAR elevation data. The positional error in GPS locations can be approximated with a Rayleigh distribution, the positional error in street geocoding and TIGER roads can be approximated with a log‐normal distribution, and the positional error in LIDAR elevation data can be approximated with a normal distribution of the original vertical error values after removal of a small number of outliers. For all four data types considered, however, these solutions are only approximations, and some evidence of non‐stationary behavior resulting in lack of normality was observed in all four datasets. Monte‐Carlo simulation of the robustness of accuracy statistics revealed that the conventional 100% Root Mean Square Error (RMSE) statistic is not reliable for non‐normal distributions. Some degree of data trimming is recommended through the use of 90% and 95% RMSE statistics. Percentiles, however, are not very robust as single positional accuracy statistics. The non‐normal distribution of positional errors in spatial data has implications for spatial data accuracy standards and error propagation modeling. Specific recommendations are formulated for revisions of the NSSDA.
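The robustness comparison described above can be illustrated with a minimal Monte‐Carlo sketch. It is not the paper's procedure: the definition of the 95% RMSE used here (the RMSE of the smallest 95% of absolute errors), the distribution parameters, the sample size, and the number of trials are all assumptions chosen for illustration, and the function names are hypothetical.

```python
import math
import random


def rmse(errors):
    """Root mean square error of a sequence of error values."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def trimmed_rmse(errors, keep=0.95):
    """RMSE over the smallest `keep` fraction of absolute errors.

    Assumed definition of the "95% RMSE": the largest 5% of errors
    are discarded before the RMSE is computed.
    """
    ordered = sorted(abs(e) for e in errors)
    # Small epsilon guards against floating-point round-off in keep * n.
    n_keep = max(1, math.floor(keep * len(ordered) + 1e-9))
    return rmse(ordered[:n_keep])


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / (len(values) - 1)
    return math.sqrt(var) / mean


def rayleigh(rng, sigma=1.0):
    """Radial error from two independent normal components (illustrative)."""
    return math.hypot(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))


def lognormal(rng, mu=0.0, sigma=1.0):
    """Log-normally distributed error magnitude (illustrative parameters)."""
    return rng.lognormvariate(mu, sigma)


def simulate(sampler, n_points=100, n_trials=500, seed=0):
    """Return the variability (CV) of the 100% RMSE and the 95% RMSE
    across repeated samples drawn from `sampler`."""
    rng = random.Random(seed)
    full, trimmed = [], []
    for _ in range(n_trials):
        sample = [sampler(rng) for _ in range(n_points)]
        full.append(rmse(sample))
        trimmed.append(trimmed_rmse(sample, keep=0.95))
    return coefficient_of_variation(full), coefficient_of_variation(trimmed)


if __name__ == "__main__":
    for name, sampler in [("Rayleigh", rayleigh), ("log-normal", lognormal)]:
        cv_full, cv_trim = simulate(sampler)
        print(f"{name}: CV of 100% RMSE = {cv_full:.3f}, "
              f"CV of 95% RMSE = {cv_trim:.3f}")
```

A lower coefficient of variation across trials means the statistic changes less from one sample of test points to the next. For a heavy‐tailed error distribution such as the log‐normal, the untrimmed RMSE is dominated by a few large errors, which is the kind of behavior the recommendation to trim addresses.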
