Abstract
We consider the problem of robustifying high-dimensional structured estimation. Robust techniques are key in real-world applications which often involve outliers and data corruption. We focus on trimmed versions of structurally regularized M-estimators in the high-dimensional setting, including the popular Least Trimmed Squares estimator, as well as analogous estimators for generalized linear models and graphical models, using possibly non-convex loss functions. We present a general analysis of their statistical convergence rates and consistency, and then take a closer look at the trimmed versions of the Lasso and Graphical Lasso estimators as special cases. On the optimization side, we show how to extend algorithms for M-estimators to fit trimmed variants and provide guarantees on their numerical convergence. The generality and competitive performance of high-dimensional trimmed estimators are illustrated numerically on both simulated and real-world genomics data.
Highlights
We consider the problem of high-dimensional estimation, where the number of variables p may greatly exceed the number of observations n
For matrix-structured regression problems, estimators using nuclear-norm regularization have been studied, e.g., by Recht et al. (2010). Another prime example is that of sparse inverse covariance estimation for graphical model selection (Ravikumar et al. 2011)
We focus on Gaussian graphical models and provide the statistical guarantees of our Trimmed Graphical Lasso estimator as presented in Section 2 (Motivating Example 2)
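To give a feel for trimming in the graphical-model setting, the following is a minimal, dependency-free sketch of the trimming principle for Gaussian precision estimation: alternate between keeping the h samples with the highest Gaussian log-likelihood under the current precision estimate and re-estimating the precision from that subset. The function name `trimmed_precision`, the ridge-regularized inverse standing in for the actual l1-penalized graphical-lasso step, and all parameter choices are illustrative assumptions, not the estimator analyzed in the paper.

```python
import numpy as np

def trimmed_precision(Xc, h, ridge=0.1, n_iter=10):
    """Alternating sketch of trimmed precision estimation.

    Xc : (n, p) array of centered observations.
    h  : number of samples to keep (n - h samples are trimmed).

    NOTE: the l1-penalized graphical-lasso refit is replaced by a
    ridge-regularized inverse to keep the sketch self-contained.
    """
    n, p = Xc.shape
    theta = np.linalg.inv(np.cov(Xc, rowvar=False) + ridge * np.eye(p))
    for _ in range(n_iter):
        # Per-sample log-likelihood differs only in the quadratic form
        # x' Theta x (logdet(Theta) is shared), so the h highest-likelihood
        # samples are those with the smallest Mahalanobis distances.
        quad = np.einsum('ij,jk,ik->i', Xc, theta, Xc)
        keep = np.argsort(quad)[:h]
        S = np.cov(Xc[keep], rowvar=False)
        theta = np.linalg.inv(S + ridge * np.eye(p))
    return theta
```

With gross outliers present, the trimming step tends to exclude them after the first pass, so the final estimate is driven by the clean samples.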
Summary
We consider the problem of high-dimensional estimation, where the number of variables p may greatly exceed the number of observations n. The development and statistical analysis of structurally constrained estimators for high-dimensional estimation has recently attracted considerable attention. These estimators seek to minimize the sum of a loss function and a weighted regularizer. The desirable theoretical properties of such regularized M-estimators can be compromised, since outliers and corruptions are often present in high-dimensional data problems. These challenges motivate the development of robust structured learning methods that can cope with observations deviating from the model assumptions. The Least Median of Squares estimator originally proposed by Rousseeuw (1984) avoids this problem, reaching a breakdown point of nearly 50%; a closely related approach 'trims' a portion of the largest residuals. This led to the consideration of sparse Least Trimmed Squares (sparse LTS) for robust high-dimensional estimation.
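The trimming idea above can be sketched in a few lines: alternate between (1) keeping the h observations with the smallest squared residuals under the current fit and (2) solving a Lasso on that subset. This is a minimal illustration of the sparse-LTS principle, not the paper's algorithm; the function names, the ISTA (proximal-gradient) inner solver, and all parameter defaults are assumptions made for the sketch.

```python
import numpy as np

def soft_threshold(z, t):
    """Elementwise soft-thresholding, the proximal operator of t*||.||_1."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def sparse_lts(X, y, h, lam, n_outer=20, n_ista=200):
    """Alternating sketch of sparse Least Trimmed Squares.

    Minimizes (approximately) the sum of the h smallest squared
    residuals plus lam * ||beta||_1, by alternating subset selection
    with an ISTA Lasso fit on the kept observations.
    """
    n, p = X.shape
    beta = np.zeros(p)
    for _ in range(n_outer):
        resid = y - X @ beta
        keep = np.argsort(resid ** 2)[:h]          # h smallest residuals
        Xs, ys = X[keep], y[keep]
        step = 1.0 / np.linalg.norm(Xs, 2) ** 2    # 1/L, L = Lipschitz const.
        for _ in range(n_ista):
            grad = Xs.T @ (Xs @ beta - ys)
            beta = soft_threshold(beta - step * grad, step * lam)
    return beta
```

Because the n - h observations with the largest residuals are excluded at each outer step, a moderate fraction of grossly corrupted responses does not pull the fit away from the sparse ground truth.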