Recent demands from big data applications have strongly motivated a successful sparse formulation of the least squares support vector regression (LSSVR) model in the primal weight space. Such an approach, called fixed-size LSSVR (FS-LSSVR), is built upon an approximation of the nonlinear feature mapping via the Nyström method, overcoming the memory constraints and high computational costs of the standard non-sparse LSSVR model. Despite this advance, an important modeling issue remains unaddressed by the FS-LSSVR model: as with the standard LSSVR model, its performance degrades considerably when the estimation data are corrupted with non-Gaussian noise or outliers. Bearing this major issue in mind, we introduce two robust variants of the FS-LSSVR model based on the M-estimation framework and the weighted least squares method. The proposed approaches, henceforth called the Robust FS-LSSVR (RFS-LSSVR) and Reweighted Robust FS-LSSVR (R2FS-LSSVR) models, produce solutions that are simultaneously robust to outliers and sparse, making use of only a small sample of training patterns as prototype vectors. We evaluate the performance of both algorithms on benchmark nonlinear system identification problems with synthetic and real-world datasets (including a large-scale dataset) corresponding to SISO and MIMO systems whose estimation outputs are contaminated with outliers. The obtained results indicate that our proposed approaches consistently outperform existing robust models designed in the dual space (e.g., the W-LSSVR and IR-LSSVR models), especially as the amount of outliers in the data increases.
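The two ingredients named above, a Nyström-approximated primal feature map and an M-estimation-style weighted least squares fit, can be illustrated with a minimal sketch. Everything below (the RBF kernel, the Huber weighting function, the MAD scale estimate, and all tuning constants) is an illustrative assumption, not the paper's actual implementation.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    # Gaussian (RBF) kernel matrix between row sets A and B (an assumed kernel choice)
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def nystrom_features(X, Z, gamma=1.0):
    # Nystrom approximation of the feature map: phi(x) = Lambda^{-1/2} U^T k(Z, x),
    # where U, Lambda come from the eigendecomposition of the m x m prototype kernel matrix
    Kmm = rbf_kernel(Z, Z, gamma)
    evals, evecs = np.linalg.eigh(Kmm)
    evals = np.clip(evals, 1e-10, None)   # guard against numerically negative eigenvalues
    Knm = rbf_kernel(X, Z, gamma)
    return Knm @ evecs / np.sqrt(evals)   # shape (n, m): approximate primal features

def robust_primal_fit(Phi, y, lam=1e-3, c=1.345, n_iter=10):
    # Iteratively reweighted ridge regression in the primal space with Huber weights
    # (a generic M-estimation sketch; c=1.345 is the usual Huber tuning constant)
    n, m = Phi.shape
    A = np.hstack([Phi, np.ones((n, 1))])   # append a bias column
    v = np.ones(n)                          # initial weights: ordinary least squares
    for _ in range(n_iter):
        # weighted regularized normal equations: (A^T V A + lam I) theta = A^T V y
        theta = np.linalg.solve(A.T @ (v[:, None] * A) + lam * np.eye(m + 1),
                                A.T @ (v * y))
        r = y - A @ theta
        s = 1.4826 * np.median(np.abs(r - np.median(r)))   # robust scale via MAD
        u = np.abs(r) / max(s, 1e-12)
        v = np.where(u <= c, 1.0, c / u)    # Huber weights down-weight large residuals
    return theta
```

In this sketch the model is fit entirely in the (approximate) primal weight space, so the solve involves only an (m+1)-by-(m+1) system per reweighting pass, which is what makes the fixed-size approach tractable for large datasets.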