A Two-Sample Test for Equality of Means in High Dimension

Karl Bruce Gregory, Raymond J Carroll, Veerabhadran Baladandayuthapani, Soumendra N Lahiri

doi:10.1080/01621459.2014.934826

Open Access

Abstract

We develop a test statistic for testing the equality of two population mean vectors in the “large-p-small-n” setting. Such a test must surmount the rank-deficiency of the sample covariance matrix, which breaks down the classic Hotelling T² test. The proposed procedure, called the generalized component test, avoids full estimation of the covariance matrix by assuming that the p components admit a logical ordering such that the dependence between components is related to their displacement. The test is shown to be competitive with other recently developed methods under ARMA and long-range dependence structures and to achieve superior power for heavy-tailed data. The test does not assume equality of covariance matrices between the two populations, is robust to heteroscedasticity in the component variances, and requires very little computation time, which allows its use in settings with very large p. Analyses of mitochondrial calcium concentration in mouse cardiac muscles over time and of copy number variations in a glioblastoma multiforme dataset from The Cancer Genome Atlas are carried out to illustrate the test. Supplementary materials for this article are available online.
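The rank-deficiency mentioned in the abstract can be checked numerically. The following is a minimal sketch, not taken from the article: it builds a sample covariance matrix from n = 5 observations in p = 12 dimensions and computes its rank exactly. The helper names `sample_covariance` and `matrix_rank`, the sizes, and the random integer data are all assumptions chosen for illustration. Because the data are centered at the sample mean, the rank can be at most n - 1, so the matrix cannot be inverted when p exceeds that bound, and Hotelling's T² needs that inverse.

```python
from fractions import Fraction
import random


def sample_covariance(rows):
    """Unbiased sample covariance matrix of n observations (rows) in p dimensions."""
    n, p = len(rows), len(rows[0])
    means = [sum(Fraction(r[j]) for r in rows) / n for j in range(p)]
    centered = [[Fraction(r[j]) - means[j] for j in range(p)] for r in rows]
    return [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(p)]
            for i in range(p)]


def matrix_rank(matrix):
    """Rank by Gaussian elimination; exact because entries are Fractions."""
    m = [row[:] for row in matrix]
    n_rows, n_cols = len(m), len(m[0])
    rank = 0
    for col in range(n_cols):
        # Find a row at or below the current pivot row with a nonzero entry.
        pivot = next((r for r in range(rank, n_rows) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        # Eliminate the entries below the pivot.
        for r in range(rank + 1, n_rows):
            factor = m[r][col] / m[rank][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


# Hypothetical "large-p-small-n" data: 5 observations, 12 components.
random.seed(0)
n, p = 5, 12
data = [[random.randint(-10, 10) for _ in range(p)] for _ in range(n)]

cov = sample_covariance(data)
rank = matrix_rank(cov)
print(f"p = {p}, n = {n}, rank of sample covariance = {rank}")
```

The printed rank never exceeds n - 1 = 4, which is well below p = 12, so the 12 × 12 matrix is singular regardless of the data drawn.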

Highlights

In many applications it is desirable to test whether the means of high-dimensional random vectors are the same in two populations
The following conditions are assumed in deriving the asymptotic distribution of the test statistic Tn
The following theorem establishes the asymptotic normality of the test statistic under the appropriate centering and scaling

Summary

Introduction

In many applications it is desirable to test whether the means of high-dimensional random vectors are the same in two populations. Bai & Saranadasa (1996) presented a test statistic which uses only the trace of the sample covariance matrix and performs well when the random vectors of each population can be expressed as linear transformations of zero-mean i.i.d. random vectors with identity covariance matrices. Dense-but-weak signal settings do exist, for example in the analysis of copy number variations, where mildly elevated or reduced numbers of DNA segment copies in cancer patients are believed to occur over regions of the chromosome rather than at isolated points (Olshen et al., 2004; Baladandayuthapani et al., 2010). It is for such cases that our test is designed. Full details for the proofs may be found in the Supplementary Material.
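To make the dense-but-weak setting concrete, here is a minimal sketch of one way to compare two samples without estimating a full covariance matrix: average the squared per-component two-sample t-statistics, using separate variance estimates for each population. This is an illustrative assumption, not the article's definition of its statistic, and it omits the centering and scaling the article derives for the asymptotic distribution. The function name `mean_squared_t`, the sample sizes, and the shift of 0.5 are all hypothetical.

```python
import random
from statistics import mean, variance


def mean_squared_t(x, y):
    """Average over components of the squared unequal-variance t-statistic.

    x and y are lists of observations; each observation is a list of p values.
    Only per-component means and variances are used, so no p x p covariance
    matrix is ever formed.
    """
    n, m = len(x), len(y)
    p = len(x[0])
    total = 0.0
    for j in range(p):
        xj = [row[j] for row in x]
        yj = [row[j] for row in y]
        # Separate variances per population: no equal-covariance assumption.
        se2 = variance(xj) / n + variance(yj) / m
        t = (mean(xj) - mean(yj)) / se2 ** 0.5
        total += t * t
    return total / p


# Hypothetical data: a small mean shift spread over every component.
random.seed(1)
p, n, m = 200, 10, 12
x_null = [[random.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
y_null = [[random.gauss(0.0, 1.0) for _ in range(p)] for _ in range(m)]
y_shift = [[random.gauss(0.5, 1.0) for _ in range(p)] for _ in range(m)]

print("no shift:   ", mean_squared_t(x_null, y_null))
print("dense shift:", mean_squared_t(x_null, y_shift))
```

Each component contributes only weak evidence on its own, but averaging over all p components accumulates it, which is why a statistic of this kind suits signals spread over regions rather than isolated points.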

Test Statistic

Main Results

Technical Details

Simulation Studies

Performance under normality

Effect of skewness

Effect of heavy-tailedness

Effect of heteroscedasticity

Effect of unequal covariance matrices

Copy Number Variation Example

Mitochondrial Calcium Concentration

Conclusions

Journal: Journal of the American Statistical Association	Publication Date: Apr 3, 2015
Citations: 87	License type: cc-by
