Abstract

In this paper, we show how to transform any optimization problem that arises from fitting a machine learning model into one that (1) detects and removes contaminated data from the training set while (2) simultaneously fitting the trimmed model on the uncontaminated data that remains. To solve the resulting nonconvex optimization problem, we introduce a fast stochastic proximal-gradient algorithm that incorporates prior knowledge through nonsmooth regularization. For data sets of size n, our approach requires O(n^{2/3}/ε) gradient evaluations to reach ε-accuracy, and when a certain error bound holds, the complexity improves to O(κ n^{2/3} log(1/ε)), where κ is a “condition number.” These rates are n^{1/3} times better than those achieved by typical, nonstochastic methods.
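To make the trimmed formulation concrete, here is a minimal sketch, not the paper's fast stochastic algorithm: a deterministic loop that alternates between (a) hard-trimming the worst-fit samples and (b) a proximal-gradient step on the kept samples, with an ℓ1 penalty standing in for the nonsmooth regularizer. The trimming fraction, step size, penalty weight, and synthetic data are all illustrative assumptions.

```python
import numpy as np

# Illustrative sketch only: trimmed least squares with an l1 regularizer,
# solved by alternating (a) keeping the h best-fit samples (trimming) and
# (b) a proximal-gradient step on those samples. The names h, lam, step and
# the synthetic data below are assumptions, not values from the paper.

def soft_threshold(w, t):
    """Prox of t * ||w||_1 (the nonsmooth regularizer)."""
    return np.sign(w) * np.maximum(np.abs(w) - t, 0.0)

def trimmed_prox_gradient(X, y, h, lam=0.1, step=None, iters=500):
    n, d = X.shape
    if step is None:
        # Rough step size from a bound on the gradient's Lipschitz constant.
        step = h / (np.linalg.norm(X, 2) ** 2)
    w = np.zeros(d)
    for _ in range(iters):
        r = X @ w - y                          # residuals on all n samples
        keep = np.argsort(r ** 2)[:h]          # trim: keep the h smallest losses
        g = X[keep].T @ r[keep] / h            # gradient on the trimmed subset
        w = soft_threshold(w - step * g, step * lam)  # proximal-gradient step
    return w, keep

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, d = 200, 5
    X = rng.standard_normal((n, d))
    w_true = np.array([2.0, -1.0, 0.0, 0.0, 3.0])
    y = X @ w_true + 0.1 * rng.standard_normal(n)
    y[:20] += 10.0                             # contaminate 10% of the labels
    w_hat, kept = trimmed_prox_gradient(X, y, h=int(0.85 * n))
    print("estimate:", np.round(w_hat, 2))
```

The paper's method replaces the full-gradient step above with a variance-reduced stochastic update, which is where the O(n^{2/3}) dependence comes from; this sketch only illustrates the trim-then-fit structure of the objective.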
