Abstract

Feature selection is one of the most important issues in supervised learning, and many feature selection approaches have been proposed in the literature. Among them, one recent line of work uses Gaussian processes (GPs), because a GP can capture well the hidden relevance between the input features and the output. However, the existing GP-based feature selection approaches suffer from a scalability problem due to the high computational cost of GP inference. Moreover, they use the Kullback–Leibler (KL) divergence in the sensitivity analysis for feature selection, but we show in this paper that the KL divergence underestimates the relevance of important features in some classification settings. To remedy these drawbacks of the existing GP-based approaches, we propose a new feature selection method based on the scalable variational Gaussian process (SVGP) and the L2 divergence. With the help of SVGP, the proposed method exploits large data sets for feature selection through so-called inducing points while avoiding the scalability problem. Moreover, we provide a theoretical analysis that motivates the choice of the L2 divergence for feature selection in both classification and regression. To validate the performance of the proposed method, we compare it with existing methods through experiments on synthetic and real data sets.
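To make the divergence-based sensitivity analysis mentioned above concrete, the sketch below shows a common form of perturbation-based relevance score and the closed-form L2 divergence between two Gaussian predictive marginals. The notation (r_j, Δ, e_j) is illustrative and not taken from the paper; the exact score and perturbation scheme used by the authors may differ.

```latex
% Hypothetical relevance score of feature j: average divergence between the
% predictive distribution at x_i and at x_i with feature j perturbed by \Delta.
r_j \;=\; \frac{1}{n}\sum_{i=1}^{n}
      D\!\left[\, p(y_\ast \mid \mathbf{x}_i)\;,\;
                  p(y_\ast \mid \mathbf{x}_i + \Delta\,\mathbf{e}_j) \,\right]

% For Gaussian predictive marginals p = \mathcal{N}(\mu_1,\sigma_1^2) and
% q = \mathcal{N}(\mu_2,\sigma_2^2), the L2 divergence is available in closed form:
\int \bigl(p(y) - q(y)\bigr)^2 \, dy
  \;=\; \frac{1}{2\sigma_1\sqrt{\pi}} \;+\; \frac{1}{2\sigma_2\sqrt{\pi}}
        \;-\; 2\,\mathcal{N}\!\bigl(\mu_1 \mid \mu_2,\; \sigma_1^2 + \sigma_2^2\bigr)
```

Unlike the KL divergence, the L2 divergence stays bounded and symmetric, which is consistent with the paper's motivation for using it in both classification and regression.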
