Abstract
Researchers apply scan statistics to test for unusually large clusters of events within a time window of specified length w, or alternatively for an unusually small window w that contains a specified number of events. In some cases, the researcher is interested in testing over a range of specified window lengths, or over a set of several specified numbers of events k (cluster sizes). In this paper, we derive accurate approximations for the joint distributions of scan statistics for a range of values of w, or of k, that can be used to set an experiment-wide level of significance that takes into account the multiple comparisons involved. We use these methods to compare different ways of choosing the window sizes for the different cluster sizes. One special case is a multiple comparison procedure based on a generalized likelihood ratio test (GLRT) for a range of window sizes. We compare the power of the GLRT with that of another method for allocating the window sizes. We find that the GLRT is sensitive for very small window sizes at the expense of moderate and larger window sizes. We illustrate these results with two examples, one involving clustering of translocation breakpoints in DNA, and the other involving disease clusters.
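As an illustration of the two statistics described above, the following minimal Python sketch computes the largest number of events in any window of length w and the shortest window containing k events. The function names `scan_statistic` and `min_window` are hypothetical, and closed windows over a list of event times are assumed. The sketch covers only the statistics themselves, not the paper's approximations to their joint distributions.

```python
from bisect import bisect_right


def scan_statistic(times, w):
    """Largest number of events in any closed window [t, t + w].

    Assumption: windows are closed, so the maximum is attained by a
    window whose left endpoint sits on an event.
    """
    ts = sorted(times)
    best = 0
    for i, start in enumerate(ts):
        # Index one past the last event at or before start + w.
        j = bisect_right(ts, start + w)
        best = max(best, j - i)
    return best


def min_window(times, k):
    """Length of the shortest interval that contains k events."""
    ts = sorted(times)
    if k < 1 or k > len(ts):
        raise ValueError("k must be between 1 and the number of events")
    # Compare each event with the event k - 1 positions later.
    return min(ts[i + k - 1] - ts[i] for i in range(len(ts) - k + 1))


if __name__ == "__main__":
    events = [1, 2, 3, 10, 11, 20]  # illustrative event times only
    print(scan_statistic(events, 2), min_window(events, 3))
```

Under these assumptions the two views are equivalent: a window of length w holds at least k events exactly when the shortest window containing k events has length at most w. A test stated in terms of cluster size can therefore be restated in terms of window length.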