Abstract

Based on Hoeffding’s inequality, many popular regression and classification models in supervised learning relax the expected risk minimization problem to the empirical risk minimization problem. However, recent theoretical results show that the bound of Bernstein’s inequality, which incorporates variance information, is often significantly tighter than that of Hoeffding’s inequality. In this paper, based on the empirical Bernstein bound, we propose a risk-averse learning machine that achieves better generalization performance by trading off low empirical loss (approximation error), small variance (estimation error), and suitable model complexity. We prove that the resulting learning machine is tractable for many popular loss functions. Moreover, because the square-root variance term generally makes the model non-convex, we introduce an auxiliary variable that removes the square root and yields an optimization problem that is, in most cases, convex in each of the two variables separately, so it can be solved by alternating Newton steps. Experimental results on artificial and benchmark datasets demonstrate that the proposed models achieve better performance than other existing empirical risk minimization models.
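
To make the idea concrete, the following is a minimal sketch under simplifying assumptions, not the authors' exact formulation: a variance-regularized least-squares objective mean(l_i) + tau * sqrt(var(l_i)) + mu * ||w||^2 motivated by the empirical Bernstein bound, where the square root is removed with the identity sqrt(v) = min over t > 0 of v/(2t) + t/2, and the problem is solved by alternating between a closed-form update of the auxiliary variable t and iterative updates of w (the paper uses Newton's method; plain gradient steps keep the sketch short). The dataset, the weights tau and mu, and the step size are hypothetical.

import numpy as np

rng = np.random.default_rng(0)
n, d = 200, 5
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true + 0.3 * rng.normal(size=n)

tau, mu = 0.5, 1e-2   # hypothetical trade-off weights: variance term and ridge penalty
w = np.zeros(d)

def losses(w):
    # per-sample squared losses l_i = (y_i - x_i^T w)^2
    return (y - X @ w) ** 2

for _ in range(50):
    # t-step: sqrt(v) = min_{t>0} v/(2t) + t/2 is attained at t = sqrt(v),
    # so the auxiliary variable has a closed-form update
    v = losses(w).var()
    t = max(np.sqrt(v), 1e-8)
    # w-step: minimize mean(l) + tau/(2t) * var(l) + mu * ||w||^2;
    # plain gradient descent stands in for the Newton step used in the paper
    for _ in range(30):
        r = y - X @ w
        l = r ** 2
        grad_mean = -2.0 / n * (X.T @ r)
        grad_var = -4.0 / n * (X.T @ ((l - l.mean()) * r))   # gradient of var(l) w.r.t. w
        grad = grad_mean + tau / (2.0 * t) * grad_var + 2.0 * mu * w
        w -= 0.02 * grad

obj = losses(w).mean() + tau * np.sqrt(losses(w).var()) + mu * (w @ w)
print("final objective:", obj)

Each alternating step decreases the joint objective in (w, t), since the t-update is exact and the w-step minimizes a smooth surrogate in which the square root has been replaced by the quadratic-over-linear term v/(2t).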
