Abstract

Identifying significant features (SFs) is important because they are the driving factors of a target outcome. However, it is difficult when there are many more features than observations. The problem becomes even more challenging in the presence of multicollinearity and infrequent common features. In such cases, standard explainable methods such as OLS and Lasso often fail to identify many SFs. To tackle these problems, we propose SFLasso-SI, a stable model that maximizes the number of SFs using selective inference. First, at each point along the regularization path, SFLasso-SI conducts selective inference as a conservative significance test. Then, it chooses the value of the regularization parameter that maximizes the number of SFs. Our extensive experiments across different types of data - text, image, and video - show that SFLasso-SI finds the largest number of SFs while maintaining prediction accuracy similar to that of the benchmark methods.
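The abstract describes a two-step loop: run a significance test at each point on the Lasso regularization path, then keep the penalty value that yields the most significant features. A minimal Python sketch of that loop, assuming scikit-learn's `lasso_path` and with the paper's selective-inference test replaced by a plain OLS p-value stand-in (the `ols_pvalues` helper below is hypothetical, not the authors' method), might look like:

```python
import numpy as np
from scipy import stats
from sklearn.linear_model import lasso_path


def ols_pvalues(X_sel, y):
    """Placeholder significance test on the selected support.
    NOTE: a real SFLasso-SI implementation would apply selective
    (post-selection) inference here; naive OLS p-values are only a
    stand-in so the sketch runs end to end."""
    n, k = X_sel.shape
    beta = np.linalg.lstsq(X_sel, y, rcond=None)[0]
    resid = y - X_sel @ beta
    dof = max(n - k, 1)
    sigma2 = resid @ resid / dof
    cov = sigma2 * np.linalg.pinv(X_sel.T @ X_sel)
    t = beta / np.sqrt(np.diag(cov))
    return 2 * (1 - stats.t.cdf(np.abs(t), dof))


def sflasso_si_sketch(X, y, alpha_level=0.05):
    """Scan the Lasso regularization path and keep the penalty that
    yields the largest number of significant features."""
    alphas, coefs, _ = lasso_path(X, y)           # full regularization path
    best_alpha, best_support, best_count = None, np.array([], dtype=int), -1
    for alpha, beta in zip(alphas, coefs.T):
        support = np.flatnonzero(beta)            # features selected at this penalty
        if support.size == 0:
            continue
        pvals = ols_pvalues(X[:, support], y)     # significance test on the support
        significant = support[pvals < alpha_level]
        if significant.size > best_count:         # maximize the number of SFs
            best_alpha, best_support, best_count = alpha, significant, significant.size
    return best_alpha, best_support


# Toy usage on synthetic data in a p >> n setting
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 200))
y = X[:, :3] @ np.array([2.0, -1.5, 1.0]) + rng.normal(scale=0.5, size=50)
alpha_star, sig_features = sflasso_si_sketch(X, y)
print(alpha_star, sig_features)
```

The key design point mirrored from the abstract is that the penalty is not tuned for prediction error (as in cross-validated Lasso) but for the count of features passing the significance test along the path.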
