Gumbel based p-value approximations for spatial scan statistics.

Allyson M Abrams, Martin Kulldorff, Ken Kleinman

doi:10.1186/1476-072x-9-61


Open Access

https://doi.org/10.1186/1476-072x-9-61


Abstract

Background: The spatial and space-time scan statistics are commonly applied for the detection of geographical disease clusters. Monte Carlo hypothesis testing is typically used to test whether the geographical clusters are statistically significant as there is no known way to calculate the null distribution analytically. In Monte Carlo hypothesis testing, simulated random data are generated multiple times under the null hypothesis, and the p-value is r/(R + 1), where R is the number of simulated random replicates of the data and r is the rank of the test statistic from the real data compared to the same test statistics calculated from each of the random data sets. A drawback to this powerful technique is that each additional digit of p-value precision requires ten times as many replicated datasets, and the additional processing can lead to excessive run times.

Results: We propose a new method for obtaining more precise p-values with a given number of replicates. The collection of test statistics from the random replicates is used to estimate the true distribution of the test statistic under the null hypothesis by fitting a continuous distribution to these observations. The choice of distribution is critical, and for the spatial and space-time scan statistics, the extreme value Gumbel distribution performs very well while the gamma, normal and lognormal distributions perform poorly. From the fitted Gumbel distribution, we show that it is possible to estimate the analytical p-value with great precision even when the test statistic is far out in the tail beyond any of the test statistics observed in the simulated replicates. In addition, Gumbel-based rejection probabilities have smaller variability than Monte Carlo-based rejection probabilities, suggesting that the proposed approach may result in greater power than the true Monte Carlo hypothesis test for a given number of replicates.

Conclusions: For large data sets, it is often advantageous to replace computer intensive Monte Carlo hypothesis testing with this new method of fitting a Gumbel distribution to random data sets generated under the null, in order to reduce computation time and obtain much more precise p-values and slightly higher statistical power.
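The two p-value calculations described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names are hypothetical, the rank r is taken to be one plus the number of replicate statistics at least as large as the observed one, and the Gumbel parameters are fitted by the method of moments, which is an assumption here because the abstract does not say how the fit is done.

```python
import math
import statistics

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def monte_carlo_p_value(observed, replicates):
    """Rank-based p-value r / (R + 1) from R null replicates."""
    # r = rank of the observed statistic among all R + 1 values
    # (assumption: ties count against the observed value).
    r = 1 + sum(1 for t in replicates if t >= observed)
    return r / (len(replicates) + 1)


def gumbel_p_value(observed, replicates):
    """Upper-tail p-value from a Gumbel distribution fitted to the replicates."""
    # Method-of-moments fit (assumed): mean = mu + gamma * beta,
    # variance = (pi * beta)^2 / 6.
    beta = statistics.stdev(replicates) * math.sqrt(6) / math.pi
    mu = statistics.fmean(replicates) - EULER_GAMMA * beta
    z = (observed - mu) / beta
    # P(T >= observed) = 1 - exp(-exp(-z)); expm1 keeps precision far in the tail.
    return -math.expm1(-math.exp(-z))
```

With R replicates the Monte Carlo p-value can never be smaller than 1/(R + 1), whereas the fitted Gumbel tail can return smaller values for an observed statistic that lies beyond every simulated one.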

Highlights

The spatial and space-time scan statistics are commonly applied for the detection of geographical disease clusters.
One frequently used method for cluster detection is the spatial scan statistic [1,2,3] and the related space-time scan statistic [4]. This method has been used to study the geography of infectious diseases such as malaria [5], vector-borne diseases such as West Nile Virus [6], many different forms of cancer [7,8,9,10,11], low birth weight [12], syndromic surveillance [13,14,15,16,17], and bovine spongiform encephalopathy [18], among many other diseases.
A likelihood ratio is calculated for the data corresponding to each window location and size, and the spatial scan statistic is the maximum of these likelihood ratios.
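The idea of taking a maximum over window-level likelihood ratios can be sketched in Python as follows. This is an illustration under assumptions that the text above does not state: it uses the Poisson form of the likelihood ratio, works on the log scale (which leaves the maximizing window unchanged), and takes each candidate window as a precomputed pair of observed and expected case counts, with the expected counts scaled to sum to the total number of cases. The function names are hypothetical.

```python
import math


def poisson_log_lr(cases_in, expected_in, total_cases):
    """Log likelihood ratio for one window (Poisson model, assumed)."""
    # Only windows with more cases than expected count as candidate clusters.
    if cases_in <= expected_in:
        return 0.0
    cases_out = total_cases - cases_in
    expected_out = total_cases - expected_in
    llr = cases_in * math.log(cases_in / expected_in)
    if cases_out > 0:  # skip the 0 * log(0) term when every case is inside
        llr += cases_out * math.log(cases_out / expected_out)
    return llr


def scan_statistic(windows, total_cases):
    """Maximum log likelihood ratio over (cases_in, expected_in) windows."""
    return max(poisson_log_lr(c, e, total_cases) for c, e in windows)
```

In a Monte Carlo test the same maximization would be repeated on each data set simulated under the null hypothesis, giving the replicate statistics against which the observed maximum is compared.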

Summary

Introduction

The spatial and space-time scan statistics are commonly applied for the detection of geographical disease clusters. One frequently used method for cluster detection is the spatial scan statistic [1,2,3] and the related space-time scan statistic [4]. This method has been used to study the geography of infectious diseases such as malaria [5], vector-borne diseases such as West Nile Virus [6], many different forms of cancer [7,8,9,10,11], low birth weight [12], syndromic surveillance [13,14,15,16,17], and bovine spongiform encephalopathy [18], among many other diseases. P-values for scan statistics are usually obtained using Monte Carlo hypothesis testing [19].



Journal: International Journal of Health Geographics
Publication Date: Jan 1, 2010
Citations: 75
License type: cc-by


