Abstract

Selecting the active variables that have a significant impact on the event of interest is an important problem in the statistical analysis of ultrahigh-dimensional data. In many applications, researchers already know from previous investigations and experience that a certain set of covariates is active. Using this prior knowledge of active variables, we propose a model-free conditional screening procedure for ultrahigh-dimensional survival data based on conditional distance correlation. The proposed procedure can effectively detect hidden active variables that are jointly important but only weakly correlated with the response. Moreover, it performs well when covariates are strongly correlated with each other. We establish the sure screening property and the ranking consistency of the proposed method and conduct extensive simulation studies, which suggest that the procedure works well in practical situations. Finally, we illustrate the new approach on a real dataset from the diffuse large-B-cell lymphoma study.
