Abstract
A new nonparametric test is proposed for the multivariate two-sample problem. Similar to Rosenbaum’s cross-match test, each observation is considered to be a vertex of a complete undirected weighted graph; interpoint distances are edge weights. A minimum-weight, r-regular subgraph is constructed, and the mean cross-count test statistic is equal to the number of edges in the subgraph containing one observation from the first group and one from the second, divided by r. Unequal distributions will tend to result in fewer edges that connect vertices between different groups. The mean cross-count test is sensitive to a wide range of distribution differences and has impressive power characteristics. We derive the first and second moments of the mean cross-count test, and note that simulation studies suggest this test statistic is asymptotically normal regardless of underlying data distributions. A small simulation study compares the power of the mean cross-count test to Hotelling’s T² test and to the cross-match test. This new test is a more powerful generalization of Rosenbaum’s test (the cross-match test is the case r = 1) and constitutes a noteworthy addition to the class of multivariate, nonparametric two-sample tests.
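The construction described in the abstract can be illustrated with a minimal sketch for the special case r = 1, where the minimum-weight r-regular subgraph is a minimum-weight perfect matching. The function names `min_weight_perfect_matching` and `mean_cross_count` are hypothetical, Euclidean distance is an assumed choice of interpoint distance, and the brute-force search is a stand-in suitable only for very small N; it is not the authors' algorithm for general r.

```python
import math


def min_weight_perfect_matching(points):
    """Brute-force minimum-weight perfect matching (the r = 1 subgraph).

    Assumption: Euclidean distance as the edge weight. The search is
    exponential in the number of points, so use it only for tiny inputs.
    """
    n = len(points)
    if n % 2:
        raise ValueError("a perfect matching needs an even number of points")
    best_weight, best_edges = math.inf, []

    def search(remaining, edges, weight):
        nonlocal best_weight, best_edges
        if weight >= best_weight:
            return  # prune: this partial matching is already too heavy
        if not remaining:
            best_weight, best_edges = weight, list(edges)
            return
        # Always pair the first unmatched vertex, so each matching is
        # enumerated exactly once.
        i, rest = remaining[0], remaining[1:]
        for k, j in enumerate(rest):
            d = math.dist(points[i], points[j])
            search(rest[:k] + rest[k + 1:], edges + [(i, j)], weight + d)

    search(list(range(n)), [], 0.0)
    return best_edges


def mean_cross_count(edges, labels, r):
    """Number of edges joining the two groups, divided by r."""
    cross = sum(1 for i, j in edges if labels[i] != labels[j])
    return cross / r


# Two well-separated groups on the real line: no edge crosses groups.
points = [(0.0,), (1.0,), (10.0,), (11.0,)]
edges = min_weight_perfect_matching(points)
print(mean_cross_count(edges, [0, 0, 1, 1], r=1))  # 0.0

# Same points with interleaved labels: both matched pairs cross groups.
print(mean_cross_count(edges, [0, 1, 0, 1], r=1))  # 2.0
```

In the first call the groups occupy different regions, so the statistic is small, which is the pattern the abstract associates with unequal distributions. For r > 1, `mean_cross_count` applies unchanged to the edge list of any r-regular subgraph, but finding the minimum-weight one requires a dedicated solver that this sketch does not provide.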
Highlights
1.1 Objective: Consider N = m + n independent multivariate observations Y1, ..., Ym and Ym+1, ..., YN, where each Yi is drawn from distribution F for 1 ≤ i ≤ m and from distribution G for m + 1 ≤ i ≤ N.
4 Conclusions: The mean cross-count test is a powerful, nonparametric multivariate two-sample test that is applicable to any case where a notion of distance between observations exists.
While this paper considers only location shifts, other simulations show that the mean cross-count (MCC) test has power in a variety of alternative cases as well.
Summary
This test should have sufficient power to be useful for applications.