Abstract
Feature selection is an important topic in high-dimensional statistics and machine learning, both for prediction and for understanding the underlying phenomena. It has many applications in computer vision, natural language processing, bioinformatics, and other fields. However, most feature selection methods in the literature have been proposed for offline learning, and the existing online feature selection methods have theoretical and practical limitations with respect to true support recovery. This paper proposes two novel online feature selection methods based on stochastic gradient descent with a hard thresholding operator. The proposed methods simultaneously select the relevant features and build linear regression or classification models on the selected variables. Theoretical justification is provided for the consistency of the proposed methods. Numerical experiments on simulated and real sparse datasets show that the proposed methods compare favourably with state-of-the-art online methods from the literature.
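To illustrate the general idea of combining stochastic gradient descent with hard thresholding, the following is a minimal sketch for sparse linear regression. It is not the authors' proposed method; the function names, step size, and sparsity level `k` are illustrative assumptions. After each stochastic gradient step, the hard thresholding operator keeps only the `k` largest-magnitude coefficients, so the iterate stays sparse and the surviving coordinates define the selected features.

```python
import numpy as np

def hard_threshold(w, k):
    """Keep the k largest-magnitude entries of w and zero out the rest."""
    out = w.copy()
    out[np.argsort(np.abs(w))[:-k]] = 0.0  # indices of all but the top-k
    return out

def sgd_hard_threshold(X, y, k, lr=0.01, epochs=100, seed=0):
    """Illustrative SGD for least squares with a hard-thresholding step
    after every update (an online variant of iterative hard thresholding)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(epochs):
        for i in rng.permutation(n):
            # Gradient of the per-sample loss 0.5 * (x_i^T w - y_i)^2
            grad = (X[i] @ w - y[i]) * X[i]
            w = hard_threshold(w - lr * grad, k)
    return w

# Hypothetical usage: noiseless data with a 2-sparse true coefficient vector.
rng = np.random.default_rng(1)
X = rng.standard_normal((200, 10))
w_true = np.zeros(10)
w_true[0], w_true[3] = 3.0, -2.0
w_hat = sgd_hard_threshold(X, X @ w_true, k=2)
selected = set(np.flatnonzero(w_hat))  # indices of the selected features
```

Because the thresholding is applied after every stochastic step, each iterate has at most `k` nonzero entries, which is what makes such schemes attractive for online settings with streaming data.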