Abstract

With the dramatic development of data collection and storage techniques, we often encounter massive high-dimensional data sets that contain outliers and heavy-tailed errors. Recently, the regularized Huber regression has been extensively developed to deal with such complex data sets. Although dozens of papers are devoted to developing efficient solvers for the regularized Huber regression, solving it remains challenging when the number of features is extremely large. In this paper, we propose safe feature screening rules for the regularized Huber regression based on duality theory. These rules can remarkably accelerate existing solvers for the regularized Huber regression by quickly reducing the number of features. Specifically, the proposed rules identify and eliminate inactive features before the solver starts, which significantly reduces the computational effort. Moreover, the proposed screening rules are safe in theory and in practice. Finally, experimental results on both synthetic and real data sets illustrate that the proposed screening rules speed up solving the regularized Huber regression while maintaining its accuracy. In particular, when the number of features is large, the speedup obtained by our rules can reach orders of magnitude.
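To make the notion of an inactive feature concrete, the following is a minimal sketch and not the screening rule proposed in the paper. It assumes an l1 penalty, a Huber threshold `delta`, and a mean-loss scaling, none of which the abstract specifies. The mask `keep` is hypothetical and stands in for the output of a screening rule. The sketch only shows that dropping features whose coefficients are zero leaves the objective value unchanged, which is why eliminating truly inactive features does not alter the solution.

```python
import numpy as np

def huber_loss(r, delta=1.0):
    """Elementwise Huber loss: quadratic for |r| <= delta, linear beyond."""
    r = np.asarray(r, dtype=float)
    a = np.abs(r)
    return np.where(a <= delta, 0.5 * r**2, delta * (a - 0.5 * delta))

def objective(X, y, beta, lam, delta=1.0):
    """Mean Huber loss plus an l1 penalty (the penalty choice is an assumption)."""
    return huber_loss(y - X @ beta, delta).mean() + lam * np.abs(beta).sum()

rng = np.random.default_rng(0)
X = rng.standard_normal((50, 8))
beta = np.array([1.5, 0.0, 0.0, -2.0, 0.0, 0.0, 0.0, 0.0])
y = X @ beta + rng.standard_t(df=2, size=50)  # heavy-tailed noise

# Hypothetical mask: stands in for the features a screening rule would retain.
keep = beta != 0

full = objective(X, y, beta, lam=0.1)
reduced = objective(X[:, keep], y, beta[keep], lam=0.1)
```

Here `full` and `reduced` agree up to floating-point rounding, while the reduced problem involves 2 columns instead of 8. A safe rule would certify such a mask from the data before the solver runs, without knowing `beta` in advance.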
