Robust regression plays an important role in many machine learning problems. A primal approach relies on the Huber loss and an iteratively reweighted $\ell_2$ method. However, because the Huber loss is not smooth and its corresponding distribution cannot be represented as a Gaussian scale mixture, such an approach is extremely difficult to handle within a probabilistic framework. To address these limitations, this paper proposes two novel losses and their corresponding probability functions. One, called Soft Huber, is well suited for modeling non-Gaussian noise. The other, Nonconvex Huber, helps produce much sparser results when imposed as a prior on the regression vector. With appropriate tuning parameters, they can represent any $\ell_q$ loss ($\frac{1}{2} \leq q < 2$), which makes the regression model more robust. We also show that both distributions take an elegant form: a Gaussian scale mixture with a generalized inverse Gaussian mixing density. This enables us to devise an expectation-maximization (EM) algorithm for solving the regression model. The EM algorithm yields an adaptive weight for each sample, which is very useful for removing noisy data or irrelevant features in regression problems. We apply our model to face recognition and show that it not only reduces the impact of noisy pixels but also removes more irrelevant face images. Our experiments demonstrate promising results on two datasets.
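To make the EM-with-adaptive-weights idea concrete, here is a minimal sketch of an EM-style iteratively reweighted least-squares loop for robust linear regression. It is an illustration under stated assumptions, not the paper's method: the weight update below uses the classical Huber psi-function with a hypothetical threshold `delta`, whereas the paper's Soft Huber and Nonconvex Huber losses (with their generalized inverse Gaussian mixing densities) would yield different closed-form weight updates, given in the full text.

```python
import numpy as np

def irls_robust_regression(X, y, delta=1.0, n_iter=50, tol=1e-8):
    """EM-style iteratively reweighted least squares (sketch).

    Assumptions: classical Huber weights, hypothetical `delta`; the
    paper's Soft/Nonconvex Huber losses imply different E-step updates.
    """
    n, d = X.shape
    beta = np.linalg.lstsq(X, y, rcond=None)[0]  # OLS initialization
    w = np.ones(n)
    for _ in range(n_iter):
        # E-step analogue: update per-sample weights from residuals;
        # samples with large residuals are down-weighted, which is how
        # the adaptive weights suppress noisy data.
        r = y - X @ beta
        abs_r = np.maximum(np.abs(r), 1e-12)
        w = np.where(abs_r <= delta, 1.0, delta / abs_r)
        # M-step analogue: solve the weighted least-squares problem.
        XtWX = X.T @ (X * w[:, None])
        XtWy = X.T @ (w * y)
        beta_new = np.linalg.solve(XtWX + 1e-10 * np.eye(d), XtWy)
        if np.linalg.norm(beta_new - beta) < tol:
            beta = beta_new
            break
        beta = beta_new
    return beta, w
```

In this scheme the returned weights `w` play the role of the adaptive weights described above: entries near zero flag samples (e.g., occluded or noisy face images) that the fit has effectively discarded.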