Abstract

A basic problem in modern multivariate analysis is testing the equality of two mean vectors in settings where the dimension p increases with the sample size n. This paper proposes a robust two-sample test for high-dimensional data against sparse and strong alternatives, in which the mean vectors of the populations differ in only a few dimensions, but the magnitude of the differences is large. The test is based on trimmed means and robust precision matrix estimators. The asymptotic joint distribution of the trimmed means is established, and the proposed test statistic is shown to have a Gumbel distribution in the limit. Simulation studies suggest that the numerical performance of the proposed test is comparable to that of non-robust tests for uncontaminated data. For cell-wise contaminated data, it outperforms non-robust tests. An illustration involves biomarker identification in an Alzheimer’s disease dataset.
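
To make the ingredients named above concrete, the following is a minimal sketch of a max-type two-sample statistic built from coordinate-wise trimmed means and compared against a Gumbel-type critical value. It is not the paper's procedure: it omits the robust precision matrix step entirely, estimates the variance of each trimmed mean by the standard winsorized-variance approximation, and borrows the type-I extreme value limit commonly used for max-type statistics. The function names, the trimming fraction `alpha`, and the test `level` are illustrative assumptions.

```python
import math


def trimmed_mean_and_se2(x, alpha=0.1):
    """Return the alpha-trimmed mean of x and an estimate of its squared standard error.

    Assumption: the squared standard error is approximated by the winsorized
    variance divided by (1 - 2*alpha)^2 * n.
    """
    n = len(x)
    xs = sorted(x)
    g = int(math.floor(alpha * n))          # observations trimmed from each tail
    kept = xs[g:n - g]
    tmean = sum(kept) / len(kept)
    # Winsorize: replace each trimmed value by the nearest retained value.
    wins = [kept[0]] * g + kept + [kept[-1]] * g
    wmean = sum(wins) / n
    wvar = sum((v - wmean) ** 2 for v in wins) / (n - 1)
    return tmean, wvar / ((1 - 2 * alpha) ** 2 * n)


def max_type_test(X, Y, alpha=0.1, level=0.05):
    """Max-type test on trimmed means (illustrative, no precision matrix).

    X, Y: lists of observations, each observation a list of p > 1 coordinates.
    Returns (statistic, critical value, reject flag).
    """
    p = len(X[0])
    stat = 0.0
    for j in range(p):
        mx, vx = trimmed_mean_and_se2([row[j] for row in X], alpha)
        my, vy = trimmed_mean_and_se2([row[j] for row in Y], alpha)
        # Squared standardized difference of trimmed means in coordinate j.
        stat = max(stat, (mx - my) ** 2 / (vx + vy))
    # Assumed limit: P(M - 2 log p + log log p <= x) -> exp(-exp(-x/2) / sqrt(pi)).
    q = -math.log(math.pi) - 2 * math.log(-math.log(1 - level))
    critical = 2 * math.log(p) - math.log(math.log(p)) + q
    return stat, critical, stat > critical
```

Taking the maximum over coordinates, rather than summing, is what targets sparse and strong alternatives: a single large standardized difference is enough to reject. Trimming each coordinate separately limits the influence of isolated contaminated cells.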
