Abstract
In high-dimensional data, many sparse regression methods have been proposed. However, they may not be robust against outliers. Recently, the use of density power weights has been studied for robust parameter estimation, and the corresponding divergences have been discussed. One such divergence is the γ-divergence, and the estimator based on it is known for its strong robustness. In this paper, we extend the γ-divergence to the regression problem, consider robust and sparse regression based on the γ-divergence, and show that it retains strong robustness under heavy contamination even when outliers are heterogeneous. The loss function is constructed from an empirical estimate of the γ-divergence with sparse regularization, and the parameter estimate is defined as the minimizer of the loss function. To obtain the robust and sparse estimate, we propose an efficient update algorithm with a monotone decreasing property of the loss function. In particular, we discuss the linear regression problem with L1 regularization in detail. In numerical experiments and real data analyses, the proposed method outperforms existing robust and sparse methods.
Highlights
In high-dimensional data, sparse regression methods have been intensively studied
The γ-divergence proposed by Fujisawa and Eguchi [9] is known for its strong robustness, which implies that the latent bias can be sufficiently small even under heavy contamination
We propose the robust and sparse regression problem based on the γ-divergence
Summary
In high-dimensional data, sparse regression methods have been intensively studied. The lasso [1] is a typical sparse linear regression method with L1 regularization, but it is not robust against outliers. Robust and sparse linear regression methods have therefore been proposed: the sparse least trimmed squares (sLTS) [4] is a sparse version of the well-known robust linear regression method LTS [5], based on the trimmed loss function with L1 regularization. We consider a loss function based on the γ-divergence with sparse regularization and propose an update algorithm to obtain the robust and sparse estimate. Fujisawa and Eguchi [9] used a Pythagorean relation on the γ-divergence, but it is not compatible with sparse regularization. Instead of this relation, we use the majorization-minimization algorithm [14]. The R package “gamreg”, which implements our proposed method, can be downloaded at http://cran.r-project.org/web/packages/gamreg/
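The robustness mechanism described above can be made concrete with a small numerical sketch. The snippet below is illustrative only and is not the paper's exact objective or the gamreg implementation: it assumes a Gaussian linear model, and the function name `gamma_loss` and the values of `gamma`, `lam`, and `sigma` are our own choices. The key point it demonstrates is the density power weight exp(−γ r²/(2σ²)), which is essentially zero for outlying residuals, so heavily contaminated observations barely contribute to the empirical loss.

```python
import numpy as np

def gamma_loss(beta, sigma, X, y, gamma=0.5, lam=0.1):
    """Illustrative gamma-divergence-type loss for a Gaussian linear model
    with an L1 penalty (a sketch, not the paper's exact objective;
    sigma-dependent constants are kept only schematically)."""
    r = y - X @ beta
    # Density power weights: near zero for large residuals, so outliers
    # are automatically down-weighted in the empirical mean below.
    w = np.exp(-gamma * r ** 2 / (2.0 * sigma ** 2))
    loss = -np.log(np.mean(w)) / gamma + (gamma / (1.0 + gamma)) * np.log(sigma ** 2)
    return loss + lam * np.sum(np.abs(beta))

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
beta_true = np.array([1.0, -2.0, 0.0, 0.0, 0.5])
y = X @ beta_true + 0.1 * rng.normal(size=100)
y[:10] += 20.0  # heavy contamination: 10% heterogeneous outliers

# Weights at the true coefficients: contaminated points get weight ~0,
# clean points keep substantial weight.
w = np.exp(-0.5 * (y - X @ beta_true) ** 2 / (2.0 * 0.1 ** 2))
```

In an actual estimation procedure, such a loss would be minimized over beta and sigma via the majorization-minimization updates mentioned above; here the weights alone show why the minimizer is insensitive to the contaminated block of observations.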