Abstract

Feature screening procedures aim to reduce the dimensionality of data with exponentially growing dimensions. Existing procedures all focus on a single type of predictor, either all continuous or all discrete, and cannot address mixed types of variables, outliers, or nonlinear trends. In this paper we first propose new feature screening procedures for the different continuous/discrete combinations of response and predictor variables. They are based, respectively, on the marginal Spearman correlation, the marginal ANOVA test, the marginal Kruskal-Wallis test, the Kolmogorov-Smirnov test, the Mann-Whitney test, and smoothing spline modeling. Extensive simulation studies are performed to compare the new and existing procedures, with the aim of identifying the best robust screening procedure for each single type of data. We then combine these best screening procedures to form a robust feature screening procedure for mixed types of data. We demonstrate its robustness against outliers and model misspecification through simulation studies and a real example.
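
To make the idea of marginal screening concrete, the following is a minimal sketch of one of the listed ingredients, screening by marginal Spearman correlation for a continuous response and continuous predictors. It is an illustration under stated assumptions, not the authors' implementation: the function names `rank_columns` and `spearman_screen`, the toy data, and the cutoff d = n / log(n) (a convention common in the screening literature) are all choices made here, and ties are not handled because the data are assumed continuous.

```python
import numpy as np

def rank_columns(a):
    """Rank each column from 1 to n. Assumes continuous data (no ties)."""
    return np.argsort(np.argsort(a, axis=0), axis=0) + 1.0

def spearman_screen(X, y, d):
    """Return the indices of the d predictors with the largest
    absolute marginal Spearman correlation with y, plus all correlations."""
    Rx = rank_columns(X)
    ry = rank_columns(y.reshape(-1, 1))[:, 0]
    # Spearman's rho is the Pearson correlation of the ranks.
    Rx -= Rx.mean(axis=0)
    ry -= ry.mean()
    rho = (Rx * ry[:, None]).sum(axis=0) / np.sqrt(
        (Rx ** 2).sum(axis=0) * (ry ** 2).sum()
    )
    order = np.argsort(-np.abs(rho))  # strongest marginal signal first
    return order[:d], rho

# Hypothetical toy data: two active predictors with monotone nonlinear
# effects and heavy-tailed noise, among p = 1000 candidates.
rng = np.random.default_rng(0)
n, p = 200, 1000
X = rng.standard_normal((n, p))
y = 2 * X[:, 0] ** 3 - 2 * np.exp(X[:, 1]) + rng.standard_t(df=3, size=n)

selected, rho = spearman_screen(X, y, d=int(n / np.log(n)))
```

Because the statistic depends on the data only through ranks, it is unchanged by monotone transformations of the response or of a predictor and is not dominated by a few extreme values, which is the kind of robustness the abstract refers to.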
