Abstract

A novel model-free interaction screening approach, called the hybrid metrics, is introduced for high-dimensional heterogeneous data analysis. The metrics are built on the variation of the conditional joint distribution function and measure both the size and the direction of an interaction. They are robust and work with many types of response variables, including continuous, discrete, and categorical ones. The hybrid metrics can be applied to effective interaction selection in classification, response index models, and Poisson regression, among others. In classification, they capture both nonlinear category-general and category-specific interaction effects, giving a comprehensive overview and precise discovery of category information. With a continuous response, they perform fairly well even when the signal strength is weak, behaving as if the true interactions were known. To facilitate implementation, we advocate a fast two-stage procedure that naturally and efficiently enforces both strong and weak heredity. We further demonstrate the superior performance of the hybrid metrics over popular competitors through exhaustive simulations and an SRBCT real data example. Supplementary materials for this article are available online.
