Abstract

This article is concerned with simultaneous tests on linear regression coefficients in high-dimensional settings. When the dimensionality is larger than the sample size, the classic $F$-test is not applicable since the sample covariance matrix is not invertible. Recently, [5] and [17] proposed testing procedures that exclude the inverse term in the $F$-statistic. However, the efficiency of such $F$-statistic-based methods is adversely affected by outlying observations and heavy-tailed distributions. To overcome this issue, we propose a robust score test based on rank regression. The asymptotic distributions of the proposed test statistic under the high-dimensional null and alternative hypotheses are established. Its asymptotic relative efficiency with respect to the test of [17] is closely related to that of the Wilcoxon test in comparison with the $t$-test. Simulation studies compare the proposed procedure with existing testing procedures and show that it is generally more robust in both size and power.
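The invertibility problem mentioned above can be checked on a toy example. The following is a minimal sketch, not the authors' code: the design matrix `X` is made up, the helpers `matrix_rank` and `gram` are hypothetical names introduced here, and the uncentered Gram matrix $X^\top X$ stands in for the sample covariance matrix (centering would lower the rank by at most one more). With $n = 3$ rows and $p = 5$ columns, the $p \times p$ matrix has rank at most $n$, so it cannot be inverted.

```python
from fractions import Fraction


def matrix_rank(rows):
    """Rank of a matrix via exact Gaussian elimination over the rationals."""
    m = [[Fraction(v) for v in row] for row in rows]
    n_rows = len(m)
    n_cols = len(m[0]) if m else 0
    rank = 0
    for col in range(n_cols):
        # Look for a non-zero pivot in this column, at or below row `rank`.
        pivot = next((r for r in range(rank, n_rows) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        # Eliminate the entries below the pivot.
        for r in range(rank + 1, n_rows):
            factor = m[r][col] / m[rank][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
        if rank == n_rows:
            break
    return rank


def gram(x):
    """Return X^T X for a design matrix given as n rows of length p."""
    p = len(x[0])
    return [[sum(row[j] * row[k] for row in x) for k in range(p)]
            for j in range(p)]


# Toy design (made-up numbers): n = 3 observations, p = 5 predictors.
X = [
    [1, 0, 2, 1, 3],
    [0, 1, 1, 4, 2],
    [2, 1, 0, 3, 1],
]
n, p = len(X), len(X[0])
G = gram(X)
rank_G = matrix_rank(G)
print(f"n={n}, p={p}, rank(X^T X)={rank_G}, invertible={rank_G == p}")
```

Exact rational arithmetic is used here only so that the rank is not blurred by floating-point round-off; in practice a numerical linear algebra routine would serve the same purpose.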

Highlights

  • With the development of technology, high-dimensional data are generated in many areas, such as hyperspectral imagery, internet portals, microarray analysis and finance.

  • A frequently encountered challenge in high-dimensional regression is the detection of relevant variables.

  • The main challenge of high-dimensional data is that the dimension $p$ is much larger than the sample size $n$.


Summary

Introduction

With the development of technology, high-dimensional data are generated in many areas, such as hyperspectral imagery, internet portals, microarray analysis and finance. The main challenge of high-dimensional data is that the dimension $p$ is much larger than the sample size $n$. When this happens, many traditional statistical methods and theories may not work, since they assume that $p$ stays fixed as $n$ increases. Moreover, such methods are designed to perform “best” under the normality assumption, and their statistical properties can be severely affected when the errors are far from normal or the data contain outliers. The R code for implementing the proposed procedure is given in a supplemental file.
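The sensitivity to outliers described above can be illustrated with a small sketch. It is not the test statistic proposed in the article: it only compares a least-squares-type score $\sum_i x_i e_i$ with a Wilcoxon-type score $\sum_i x_i \{R_i - (n+1)/2\}$, where $R_i$ is the rank of $e_i$. The data, the helper names `centered_ranks` and `dot`, and the choice of score are assumptions made for illustration.

```python
def centered_ranks(values):
    """Wilcoxon-type scores: rank minus (n + 1) / 2 (assumes no ties)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for r, i in enumerate(order, start=1):
        ranks[i] = r
    mid = (len(values) + 1) / 2
    return [r - mid for r in ranks]


def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(u * v for u, v in zip(a, b))


# Made-up centered covariate and residuals.
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
clean = [0.3, -0.5, 0.1, -0.2, 0.4]

# Same residuals with the fourth observation replaced by a gross outlier.
contaminated = list(clean)
contaminated[3] = 400.0

ls_clean = dot(x, clean)
ls_bad = dot(x, contaminated)
rank_clean = dot(x, centered_ranks(clean))
rank_bad = dot(x, centered_ranks(contaminated))

# Largest value the rank score can take for this x, whatever the residuals.
rank_bound = dot(sorted(x), sorted(centered_ranks(clean)))

print("least-squares score:", ls_clean, "->", ls_bad)
print("rank score:", rank_clean, "->", rank_bad, "(bound:", rank_bound, ")")
```

The least-squares score moves in proportion to the size of the outlier, whereas the rank score can never exceed `rank_bound` in absolute value, because an outlier can at most take the largest or smallest rank.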

Test statistics
Asymptotic property
Simulation
A real-data application
Proof of Theorem 1
