Abstract
Variable selection identifies the truly important predictors among a set of inputs. In high-dimensional data, complex nonlinear dependencies and strong coupling among predictors make this task difficult. Real-world applications have also increased the demand for interpretable selection processes: a pragmatic approach should not only yield the most predictive covariates but also give clear, easy-to-understand reasons for removing the others. To meet these requirements, this paper proposes an approach for transparent and nonlinear variable selection. To transparently decouple the information within the input predictors, a three-step heuristic search is designed that groups the predictors into four subsets: relevant predictors, which are selected, and uninformative, redundant, and conditionally independent predictors, which are removed. A nonlinear partial correlation coefficient is introduced to better identify predictors that have a nonlinear functional dependence with the response. The selected subset serves as suitable input for commonly used predictive models. The proposed method is shown to outperform state-of-the-art baselines in terms of predictive accuracy and model interpretability.
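To make the four-way grouping concrete, the following Python sketch shows one way such a three-step filter could look. It is not the paper's algorithm. The dependence score used here (the square root of R^2 from a polynomial least-squares fit on residuals), the thresholds `tau` and `rho`, the greedy search order, the toy data, and all function names are assumptions chosen purely for illustration.

```python
import numpy as np

def poly_basis(Z, degree=3):
    """Intercept plus powers 1..degree of each standardized column (no cross terms)."""
    Z = np.asarray(Z, dtype=float)
    Z = Z.reshape(Z.shape[0], -1)
    cols = [np.ones(Z.shape[0])]
    for j in range(Z.shape[1]):
        z = (Z[:, j] - Z[:, j].mean()) / (Z[:, j].std() + 1e-12)
        cols.extend(z ** d for d in range(1, degree + 1))
    return np.column_stack(cols)

def residual(t, Z, degree=3):
    """Residual of t after a least-squares polynomial fit on Z (mean-centering if Z is empty)."""
    t = np.asarray(t, dtype=float)
    if Z is None or np.size(Z) == 0:
        return t - t.mean()
    B = poly_basis(Z, degree)
    coef, *_ = np.linalg.lstsq(B, t, rcond=None)
    return t - B @ coef

def dependence(x, y, Z=None, degree=3):
    """Stand-in nonlinear partial dependence score in [0, 1] (an assumption, not the paper's coefficient).

    Both x and y are residualized on Z; the score is sqrt(R^2) of a polynomial
    fit of the y-residual on the x-residual.
    """
    rx, ry = residual(x, Z, degree), residual(y, Z, degree)
    ss_tot = np.sum((ry - ry.mean()) ** 2)
    if ss_tot == 0:
        return 0.0
    ss_res = np.sum(residual(ry, rx, degree) ** 2)
    return float(np.sqrt(max(0.0, 1.0 - ss_res / ss_tot)))

def three_step_selection(X, y, tau=0.2, rho=0.9, degree=3):
    """Group column indices of X into four subsets; tau and rho are hypothetical thresholds."""
    groups = {"relevant": [], "uninformative": [], "redundant": [],
              "conditionally_independent": []}
    marginal = [dependence(X[:, j], y, degree=degree) for j in range(X.shape[1])]
    order = sorted(range(X.shape[1]), key=lambda j: -marginal[j])

    # Step 1: drop predictors with weak marginal dependence on the response.
    informative = []
    for j in order:
        (informative if marginal[j] >= tau else groups["uninformative"]).append(j)

    # Step 2: drop predictors that are almost a function of a stronger kept predictor.
    kept = []
    for j in informative:
        if any(dependence(X[:, k], X[:, j], degree=degree) > rho for k in kept):
            groups["redundant"].append(j)
        else:
            kept.append(j)

    # Step 3: keep a predictor only if it still informs y given those already selected.
    for j in kept:
        Z = X[:, groups["relevant"]] if groups["relevant"] else None
        if dependence(X[:, j], y, Z, degree) >= tau:
            groups["relevant"].append(j)
        else:
            groups["conditionally_independent"].append(j)
    return groups

def toy_data(n=500, seed=0):
    """Synthetic example: x0 and x1 drive y, x2 is noise, x3 copies x0, x4 mixes x0 and x1."""
    rng = np.random.default_rng(seed)
    x0, x1, x2 = (rng.uniform(-1, 1, n) for _ in range(3))
    x3 = x0 + 0.15 * rng.normal(size=n)
    x4 = 0.5 * (x0 + x1) + 0.1 * rng.normal(size=n)
    y = 2 * x0 ** 2 + x1 + 0.05 * rng.normal(size=n)
    return np.column_stack([x0, x1, x2, x3, x4]), y

if __name__ == "__main__":
    X, y = toy_data()
    for name, idx in three_step_selection(X, y).items():
        print(name, sorted(idx))
```

In the toy data, `y` depends on `x0` only through `x0 ** 2`, so an ordinary linear correlation would be close to zero for that predictor. This is the kind of case a nonlinear dependence measure is meant to catch. Each removed predictor also ends up in a named group, which is what makes the removal reasons easy to report.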