Abstract
In this article, we propose new multivariate two-sample tests based on nearest neighbor type coincidences. While several existing tests for the multivariate two-sample problem perform poorly for high dimensional data, and many of them are not applicable when the dimension exceeds the sample size, these proposed tests can be conveniently used in the high dimension low sample size (HDLSS) situations. Unlike Schilling (1986) [26] and Henze’s (1988) test based on nearest neighbors, under fairly general conditions, these new tests are found to be consistent in HDLSS asymptotic regime, where the sample size remains fixed and the dimension grows to infinity. Several high dimensional simulated and real data sets are analyzed to compare their empirical performance with some popular two-sample tests available in the literature. We further investigate the behavior of these proposed tests in classical asymptotic regime, where the dimension of the data remains fixed and the sample size tends to infinity. In such cases, they turn out to be asymptotically distribution-free and consistent under general alternatives.
Published Version
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have