Abstract

This article proposes a test for the two-sample location problem in high-dimensional data. In the general high-dimensional case, the data dimension can be much larger than the sample size, and the underlying distribution may be far from normal. Existing tests that require an explicit relationship between the data dimension and the sample size, or that are designed for multivariate normal distributions, may lose power significantly and may even yield type I error rates that stray from the nominal level. To overcome this issue, we propose an adaptive group p-values combination test that is robust against both high dimensionality and non-normality. Simulation studies show that the proposed test controls the type I error rate correctly and outperforms some existing tests in most situations. An ageing human brain microarray data set is used to further illustrate the method.

Highlights

  • The setting is two independent random samples of sizes n1 and n2 from m-variate distributions with m-variate location parameters μ1 and μ2, respectively

  • We propose an adaptive group p-values combination test (AGCP), which optimizes the significance evidence of GCP obtained on each pair of a set of candidate thresholds. Because it is based only on marginal test statistics, it poses no demands on the dimensionality and can be applied to the two-sample location problem for data of arbitrary dimension

  • We show that the proposed test outperforms some competing multivariate tests with respect to the type I error rate and power in most scenarios
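The idea of relying only on marginal test statistics can be illustrated with a minimal sketch. It computes one two-sided p-value per coordinate, so nothing in it depends on how m compares with the sample sizes. The choice of marginal statistic (a Welch-type statistic with a normal approximation) and the function name `marginal_p_values` are assumptions made for illustration; the source does not specify which marginal test AGCP uses, and this sketch does not implement the grouping or threshold-selection steps.

```python
import math

import numpy as np


def marginal_p_values(x1, x2):
    """Two-sided p-value for each coordinate (rows are observations).

    Assumption: a Welch-type statistic with a normal approximation is
    used as the marginal test; this is an illustrative choice only.
    """
    n1, n2 = x1.shape[0], x2.shape[0]
    diff = x1.mean(axis=0) - x2.mean(axis=0)
    se = np.sqrt(x1.var(axis=0, ddof=1) / n1 + x2.var(axis=0, ddof=1) / n2)
    z = diff / se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    return np.array([math.erfc(abs(v) / math.sqrt(2.0)) for v in z])


# Hypothetical data: dimension m far larger than either sample size.
rng = np.random.default_rng(1)
n1, n2, m = 30, 30, 500
x1 = rng.standard_normal((n1, m))
x2 = rng.standard_normal((n2, m))
x2[:, 0] += 10.0  # shift the location of the first coordinate only

p = marginal_p_values(x1, x2)
```

Each p-value uses a single coordinate, so the computation goes through unchanged when m exceeds n1 + n2. How such p-values are grouped and combined is the subject of the proposed test and is not shown here.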

Summary

Introduction

Consider two independent random samples of sizes n1 and n2 from m-variate distributions with m-variate location parameters μ1 and μ2, respectively. One reason Hotelling's T2 test runs into difficulty in high dimensions is that it contains the inverse of the sample covariance matrix: the sample covariance matrix may not converge to the population covariance matrix when m is close to n, and its inverse is undefined when m > n. To address this issue, under the assumption of equal covariance matrices, Bai and Saranadasa proposed a new test by removing S_n^{-1} from Hotelling's T2 test. They derived the asymptotic normality of the test statistic when m and n are of the same order. In microarray data, tens of thousands of genes are observed on tens of hundreds of samples.
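The breakdown of Hotelling's T2 when m > n can be checked numerically. The sketch below is an illustration under stated assumptions, not the authors' code: it takes n to mean n1 + n2 - 2, uses the pooled sample covariance, and the function names are hypothetical. It shows that the pooled covariance has rank at most n1 + n2 - 2, so it cannot be inverted when m is larger, while the squared distance between the sample means, which involves no inverse, remains computable. That squared distance is only the raw quantity left once the inverse is dropped; the centering and scaling used in the Bai and Saranadasa statistic are not reproduced here.

```python
import numpy as np


def pooled_covariance(x1, x2):
    """Pooled sample covariance matrix (rows are observations)."""
    n1, n2 = x1.shape[0], x2.shape[0]
    c1 = x1 - x1.mean(axis=0)
    c2 = x2 - x2.mean(axis=0)
    return (c1.T @ c1 + c2.T @ c2) / (n1 + n2 - 2)


def squared_mean_distance(x1, x2):
    """Squared Euclidean distance between the two sample mean vectors."""
    d = x1.mean(axis=0) - x2.mean(axis=0)
    return float(d @ d)


# Hypothetical data with dimension m larger than n1 + n2 - 2.
rng = np.random.default_rng(0)
n1, n2, m = 10, 12, 50
x1 = rng.standard_normal((n1, m))
x2 = rng.standard_normal((n2, m))

s = pooled_covariance(x1, x2)
rank = int(np.linalg.matrix_rank(s))  # at most n1 + n2 - 2, hence below m
dist2 = squared_mean_distance(x1, x2)  # well defined without any inverse
```

Since `rank` is smaller than m, the m x m matrix `s` is singular and the quadratic form in Hotelling's T2 cannot be evaluated, whereas `dist2` is available for any m.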
