Abstract

We propose the threshold updating method for terminating variable selection, together with two variable selection methods. In the threshold updating method, we update the threshold value whenever an approximation error smaller than the current threshold value is obtained. The first variable selection method combines forward selection by block addition with backward selection by block deletion. In this method, starting from the empty set of input variables, we add several input variables at a time until the approximation error falls below the threshold value. Then we search for deletable variables by block deletion. The second method combines the first method with variable selection by Linear Programming Support Vector Regressors (LPSVRs). By training an LPSVR with a linear kernel, we evaluate the weights of the decision function and delete the input variables whose associated absolute weights are zero. Then we carry out block addition and block deletion. Through computer experiments on benchmark data sets, we show that the proposed methods perform variable selection faster than the method that uses block deletion alone, and that the threshold updating method yields a lower approximation error than a fixed threshold. We also compare our method with an embedded method, which determines the optimal variables during training, and show that our method gives comparable or better variable selection performance.
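As a minimal sketch of the two proposed methods, the Python code below is illustrative rather than the paper's implementation: the cross-validation error estimator `cv_error`, the ranking of candidates by their individual contribution, the block size of two, and the use of Lasso as a stand-in for the linear-kernel LPSVR are all assumptions made here for concreteness.

```python
from sklearn.linear_model import Lasso
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVR

def cv_error(X, y, variables):
    """Approximation error on a candidate variable subset,
    estimated by 5-fold cross-validation (illustrative choice)."""
    scores = cross_val_score(SVR(), X[:, variables], y,
                             scoring="neg_mean_squared_error", cv=5)
    return -scores.mean()

def block_addition_deletion(X, y, threshold, block_size=2):
    """Forward selection by block addition followed by backward deletion,
    with the threshold updated whenever a smaller error is found."""
    n = X.shape[1]
    selected = []
    # Block addition: add the best-ranked unused variables several at a
    # time until the validation error falls below the current threshold.
    while len(selected) < n:
        candidates = [j for j in range(n) if j not in selected]
        ranked = sorted(candidates,
                        key=lambda j: cv_error(X, y, selected + [j]))
        selected += ranked[:block_size]
        err = cv_error(X, y, selected)
        if err < threshold:
            threshold = err  # threshold updating
            break
    # Backward deletion (blocks simplified here to single variables):
    # a variable is deletable if the error without it stays at or below
    # the current threshold, which is again updated on improvement.
    for j in list(selected):
        rest = [k for k in selected if k != j]
        if rest:
            err = cv_error(X, y, rest)
            if err <= threshold:
                selected, threshold = rest, min(threshold, err)
    return selected, threshold

def l1_prune(X, y, alpha=0.1):
    """Pre-pruning in the spirit of the second method: delete variables
    whose weights in an L1-regularized linear model are zero.  Lasso is
    a stand-in for the paper's linear-kernel LPSVR, whose decision
    function weights play the same role."""
    coef = Lasso(alpha=alpha).fit(X, y).coef_
    return [j for j in range(X.shape[1]) if abs(coef[j]) > 1e-8]
```

In use, the second method would run `l1_prune` first and then apply `block_addition_deletion` to the surviving variables; the first method skips the pruning step.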

Introduction

Function approximation estimates a continuous output value for given inputs based on the relationship acquired from a set of input-output pairs. Support Vector Machines (SVMs) were developed for pattern recognition and have been extended to function approximation problems as Support Vector Regressors (SVRs) [3], Least Squares Support Vector Regressors (LSSVRs) [4], and Linear Programming Support Vector Regressors (LPSVRs) [5]. With a large number of input variables we may encounter problems such as high computational cost, and redundant input variables may deteriorate the generalization ability. According to the selection criterion used, variable selection methods are classified into wrapper methods and filter methods. Although the computational cost of filter methods may be small, they run the risk of selecting a subset of input variables that deteriorates the generalization ability of the regressor.
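To make the wrapper/filter distinction concrete, the following sketch pairs a simple filter criterion (ranking variables by absolute correlation with the output, chosen here purely as an example) with a wrapper criterion (the cross-validation error of the regressor itself); neither function is taken from the paper.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVR

def filter_rank(X, y):
    """Filter criterion: rank input variables by |correlation| with the
    output, without training any regressor (cheap but regressor-agnostic)."""
    corr = [abs(np.corrcoef(X[:, j], y)[0, 1]) for j in range(X.shape[1])]
    return list(np.argsort(corr)[::-1])

def wrapper_error(X, y, variables):
    """Wrapper criterion: validation error of the regressor trained on the
    candidate subset (costly but tied to the regressor's generalization)."""
    scores = cross_val_score(SVR(), X[:, variables], y,
                             scoring="neg_mean_squared_error", cv=5)
    return -scores.mean()
```

The filter criterion never trains the regressor, which is why it is cheap but may mis-rank variables that matter only to a particular regressor; the wrapper criterion reflects the regressor's generalization ability at the cost of repeated training.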
