Abstract
This paper is concerned with a conditional test for regression coefficients in ultrahigh dimensional linear models. Conditioning on a subset of important predictors in the model, we test the overall significance of the regression coefficients of the remaining ultrahigh dimensional predictors. We first propose a conditional U-statistic test (CUT) based on an estimated U-statistic for a high dimensional linear regression model and prove that its null asymptotic distribution is normal under some mild assumptions. However, the empirical power of the proposed test is adversely affected by the dimensionality of the predictors. To address this, we further propose a two-stage CUT with screening (CUTS) procedure, based on a random data-splitting strategy, that reduces the dimensionality under the sparsity assumption and enhances the empirical power. In the first stage, we divide the data randomly into two equal halves and apply conditional sure independence screening to the first half to reduce the dimensionality; in the second stage, we apply the proposed CUT to the second half. To eliminate the effect of random data splitting and further enhance the empirical power, we also develop a powerful ensemble algorithm based on a multiple-splitting strategy and prove that the family-wise error rate is asymptotically controlled at a given significance level. We demonstrate the excellent finite-sample performance of the proposed conditional test via Monte Carlo simulations and a real data analysis.
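The first stage of the two-stage procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `split_and_screen`, its arguments, and the screening utility (the absolute correlation between each remaining predictor and the response after both are residualized on the conditioning set) are hypothetical choices made for the example. The abstract does not define the CUT statistic, so the second stage is not reproduced here; it would be applied to the held-out half using only the retained predictors.

```python
import numpy as np

def split_and_screen(X_cond, X_rest, y, keep, seed=0):
    """Randomly halve the sample, then screen the remaining predictors
    on the first half, conditioning on the columns of X_cond.

    Returns the indices of the retained columns of X_rest and the row
    indices of the second half (reserved for the test stage).
    """
    rng = np.random.default_rng(seed)
    n = y.shape[0]
    perm = rng.permutation(n)
    first, second = perm[: n // 2], perm[n // 2:]

    # Design matrix for the conditioning set, with an intercept.
    Z = np.column_stack([np.ones(first.size), X_cond[first]])

    def residualize(A):
        # Remove the least-squares projection of A onto the columns of Z.
        coef, *_ = np.linalg.lstsq(Z, A, rcond=None)
        return A - Z @ coef

    ry = residualize(y[first])
    rX = residualize(X_rest[first])

    # Assumed screening utility: absolute correlation between each
    # residualized predictor and the residualized response.
    norms = np.sqrt(np.einsum("ij,ij->j", rX, rX))
    safe = np.where(norms > 0, norms, np.inf)  # constant columns score 0
    score = np.abs(rX.T @ ry) / safe

    selected = np.sort(np.argsort(-score)[:keep])
    return selected, second
```

Keeping the screening half and the testing half disjoint is what allows the second-stage test to be applied to the retained predictors without reusing the data that selected them.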