Abstract

For many practical high-dimensional problems, interactions have been increasingly found to play important roles beyond main effects. A representative example is gene-gene interaction. Joint analysis, which analyzes all interactions and main effects in a single model, can be seriously challenged by high dimensionality. For high-dimensional data analysis in general, marginal screening has been established as effective for reducing computational cost, increasing stability, and improving estimation/selection performance. Most of the existing marginal screening methods are designed for the analysis of main effects only. The existing screening methods for interaction analysis are often limited by making stringent model assumptions, lacking robustness, and/or requiring predictors to be continuous (and hence lacking flexibility). A unified marginal screening approach tailored to interaction analysis is developed, which can be applied to regression, classification, and survival analysis. Predictors are allowed to be continuous and discrete. The proposed approach is built on Coefficient of Variation (CV) filters based on information entropy. Statistical properties are rigorously established. It is shown that the CV filters are almost insensitive to the distribution tails of predictors, correlation structure among predictors, and sparsity level of signals. An efficient two-stage algorithm is developed to make the proposed approach scalable to ultrahigh-dimensional data. Simulations and the analysis of TCGA LUAD data further establish the practical superiority of the proposed approach.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.