Abstract
A novel feature selection algorithm that combines the ideas of linear support vector machines (SVMs) and random probes is proposed. A random probe is first generated artificially from a Gaussian distribution and appended to the data set as an extra input variable. Next, a standard 2-norm or 1-norm linear support vector machine is trained on this augmented data set. The coefficient, or weight, of each input feature in the linear SVM is compared to that of the random probe. Under several statistical assumptions, the probability that each input feature is more relevant than the random probe can be computed easily. The proposed feature selection method is intuitive to use in real-world problems, and it automatically determines the optimal number of features needed. It can also be extended to selecting significant interaction and/or quadratic terms in a second-order polynomial representation.
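The probe-appending idea described above can be sketched as follows. This is an illustrative example, not the authors' implementation: it assumes scikit-learn's `LinearSVC` as the 2-norm linear SVM, uses a single Gaussian probe, and simply compares absolute weights against the probe's weight rather than computing the paper's probability under its statistical assumptions.

```python
# Sketch of probe-based feature selection with a linear SVM.
# Assumptions (not from the paper's text): scikit-learn's LinearSVC,
# one Gaussian probe, selection by comparing |weight| to the probe's |weight|.
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)

# Synthetic data: 200 samples, 5 features; only feature 0 drives the label.
X = rng.standard_normal((200, 5))
y = (X[:, 0] > 0).astype(int)

# Step 1: generate a Gaussian random probe and append it as an extra input.
probe = rng.standard_normal((X.shape[0], 1))
X_aug = np.hstack([X, probe])

# Step 2: train a standard 2-norm linear SVM on the augmented data set.
svm = LinearSVC(C=1.0, max_iter=10000).fit(X_aug, y)
w = np.abs(svm.coef_.ravel())

# Step 3: keep features whose |weight| exceeds the probe's |weight|.
probe_weight = w[-1]
selected = [i for i in range(X.shape[1]) if w[i] > probe_weight]
print(selected)
```

Because the probe is pure noise, any true feature that ranks below it is unlikely to be relevant; with the synthetic data above, the informative feature 0 is retained.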