Abstract

Background: A semiparametric density ratio method, which borrows strength from two or more samples, can be applied to moving windows of variable size in cluster detection. The method requires neither prior knowledge of the underlying distribution nor the number of cases before scanning. In this paper, the semiparametric cluster detection procedure is combined with Storey's q-value, a method for controlling the false discovery rate (FDR), to account for the multiple testing problem induced by the overlapping scanning windows.

Results: Simulations using Kulldorff's Northeastern benchmark data show that, for binary data, the semiparametric method and Kulldorff's method perform similarly well. When the data are not binary, the semiparametric methodology still works in many cases, but Kulldorff's method requires the choice of a correct probability model, namely the correct scan statistic, in order to achieve power comparable to that of the semiparametric method. Kulldorff's method with an inappropriate probability model may lose power.

Conclusions: The semiparametric method proposed in the paper can achieve good power when detecting localized clusters. The method does not require a specific distributional assumption other than the tilt function. In addition, it is possible to adapt other scan schemes (e.g., elliptic spatial scan, flexible shape scan) to search for clusters as well.

Highlights

  • A semiparametric density ratio method, which borrows strength from two or more samples, can be applied to moving windows of variable size in cluster detection

  • The semiparametric method proposed in the paper can achieve good power when detecting localized clusters

  • We extended the original semiparametric cluster detection procedure by incorporating Storey’s q-value method [19,20], a type of false discovery rate (FDR) methodology [21], to take into account the multiple testing problem inherent in cluster detection


Summary

Introduction

A semiparametric density ratio method, which borrows strength from two or more samples, can be applied to moving windows of variable size in cluster detection. The semiparametric cluster detection procedure is combined with Storey’s q-value, a method for controlling the false discovery rate (FDR), to account for the multiple testing problem induced by the overlapping scanning windows. A cluster can be defined as a spatial or temporal subregion where the probability distribution of an event differs from that in the rest of the region. More generally, a cluster is a subregion where an observable behaves differently than it does in the rest of the region. For example, a subregion comprised of several counties with a higher rate of a certain type of cancer than the other counties in the study region defines a cancer cluster [1]. See Glaz and Balakrishnan (1999), Glaz et al. (2001) [2,3]
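To make the multiple testing step concrete, the following is a minimal sketch of how q-values might be computed from the p-values of the scanning windows. It is an illustration under stated assumptions, not the paper's implementation: the function name `storey_qvalues`, the fixed tuning parameter `lam = 0.5`, and the example p-values are all hypothetical, and Storey's full procedure typically chooses the tuning parameter by smoothing over a grid rather than fixing it.

```python
import numpy as np


def storey_qvalues(pvalues, lam=0.5):
    """Simplified Storey-style q-values with a fixed tuning parameter.

    Assumption: `lam` is fixed at 0.5 for illustration; the full method
    usually estimates pi0 by smoothing over a range of lambda values.
    """
    p = np.asarray(pvalues, dtype=float)
    m = p.size

    # Estimate the proportion of true null hypotheses: p-values above
    # `lam` are assumed to come mostly from nulls (uniform on [0, 1]).
    pi0 = min(1.0, np.mean(p > lam) / (1.0 - lam))

    # Sort p-values and compute pi0 * m * p_(i) / i for each rank i.
    order = np.argsort(p)
    ranked = p[order]
    raw = pi0 * m * ranked / np.arange(1, m + 1)

    # Enforce monotonicity: q_(i) is the minimum of raw values at ranks >= i.
    q_sorted = np.minimum.accumulate(raw[::-1])[::-1]
    q_sorted = np.minimum(q_sorted, 1.0)

    # Return q-values in the original order of the input p-values.
    q = np.empty(m)
    q[order] = q_sorted
    return q


# Hypothetical p-values, one per scanning window (made-up numbers).
window_pvalues = [0.02, 0.9, 0.001, 0.6, 0.01]
window_qvalues = storey_qvalues(window_pvalues)

# Flag windows whose q-value falls below a chosen FDR level (here 0.05).
flagged = [i for i, q in enumerate(window_qvalues) if q <= 0.05]
```

Thresholding q-values rather than raw p-values targets the expected proportion of false discoveries among the flagged windows, which is the role the q-value plays for the many overlapping scanning windows described above.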

