Abstract

We introduce c-lasso, a Python package that enables sparse and robust linear regression and classification with linear equality constraints. The underlying statistical forward model is assumed to be of the following form: \[ y = X \beta + \sigma \epsilon \qquad \textrm{subject to} \qquad C\beta=0 \] Here, $X \in \mathbb{R}^{n\times d}$ is a given design matrix and the vector $y \in \mathbb{R}^{n}$ is a continuous or binary response vector. The matrix $C$ is a general constraint matrix. The vector $\beta \in \mathbb{R}^{d}$ contains the unknown coefficients and $\sigma$ is an unknown scale. Prominent use cases are (sparse) log-contrast regression with compositional data $X$, which requires the constraint $1_d^T \beta = 0$ (Aitchison and Bacon-Shone 1984), and the Generalized Lasso, which is a special case of the described problem (see, e.g., James, Paulson, and Rusmevichientong 2020, Example 3). The c-lasso package provides estimators for inferring unknown coefficients and scale (i.e., perspective M-estimators (Combettes and Müller 2020a)) of the form \[ \min_{\beta \in \mathbb{R}^d, \sigma \in \mathbb{R}_{0}} f\left(X\beta - y,{\sigma} \right) + \lambda \left\lVert \beta\right\rVert_1 \qquad \textrm{subject to} \qquad C\beta = 0 \] for several convex loss functions $f(\cdot,\cdot)$. This includes the constrained Lasso, the constrained scaled Lasso, and sparse Huber M-estimators with linear equality constraints.
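The zero-sum constraint $1_d^T \beta = 0$ for log-contrast regression has a concrete consequence worth illustrating: it makes the linear predictor on log-transformed compositional data invariant to rescaling of each sample. The sketch below is a hypothetical, stdlib-only illustration of this property; the names (`predict`, `composition`) are ours and not part of the c-lasso API.

```python
import math
import random

# Illustration (not c-lasso code): with compositional data, a coefficient
# vector beta satisfying 1_d^T beta = 0 yields predictions on log-features
# that do not depend on the (arbitrary) total scale of each sample.

random.seed(0)
d = 5
composition = [random.random() for _ in range(d)]  # raw relative abundances
beta = [1.0, -0.5, 0.25, -0.75, 0.0]               # coefficients summing to zero
assert abs(sum(beta)) < 1e-12

def predict(sample, coef):
    # linear predictor on log-features: sum_j beta_j * log(x_j)
    return sum(b * math.log(x) for b, x in zip(coef, sample))

scaled = [10.0 * x for x in composition]           # rescale the whole sample
# rescaling shifts every log-feature by log(10); since the beta entries sum
# to zero, these shifts cancel and the prediction is unchanged
print(abs(predict(composition, beta) - predict(scaled, beta)) < 1e-9)  # True
```

This invariance is exactly why the zero-sum constraint is the natural choice for compositional design matrices, where only relative (not absolute) abundances are meaningful.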


Summary

We introduce c-lasso, a Python package that enables sparse and robust linear regression and classification with linear equality constraints. ∥β∥1 subject to Cβ = 0 for several convex loss functions f (·, ·) This includes the constrained Lasso, the constrained scaled Lasso, sparse Huber M-estimators with linear equality constraints, and constrained (Huberized) Square Hinge Support Vector Machines (SVMs) for classification. # Formulation R1 problem.formulation.huber = False problem.formulation.concomitant = False problem.formulation.classification = False This regression problem uses the Huber loss hρ as objective function for robust model fitting with an L1 penalty and linear equality constraints on the β vector. R4 Constrained sparse Huber regression with concomitant scale estimation: This formulation combines R2 and R3 allowing robust joint estimation of the (constrained) β vector and the scale σ in a concomitant fashion (Combettes & Müller, 2020a, 2020b).

Constrained sparse classification with Square Hinge loss:
Constrained sparse classification with Huberized Square Hinge loss:
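The two classification losses above can be sketched in a few lines. Below are assumed standard definitions for a margin z = y · x^T β (not c-lasso's exact code): the Square Hinge loss, and a Huberized variant that grows linearly instead of quadratically for badly misclassified points, mirroring the robustness idea of the Huber regression losses.

```python
# Hedged sketch of the two classification losses (assumed conventions).

def square_hinge(z):
    # zero for confidently correct points (margin z >= 1), quadratic otherwise
    return max(0.0, 1.0 - z) ** 2

def huberized_square_hinge(z, rho=-1.0):
    # quadratic on [rho, 1), linear below rho; the linear branch is chosen
    # so value and slope match at z = rho (C^1-continuity)
    if z >= 1.0:
        return 0.0
    if z >= rho:
        return (1.0 - z) ** 2
    return (1.0 - rho) ** 2 + 2.0 * (1.0 - rho) * (rho - z)

print(square_hinge(2.0))  # 0.0: correctly classified with margin
print(square_hinge(0.0))  # 1.0
# the two losses coincide on the quadratic region
print(huberized_square_hinge(0.0) == square_hinge(0.0))  # True
```

Huberizing the loss limits how strongly a far-misclassified outlier pulls on the decision boundary, analogous to how the Huber regression loss bounds the influence of large residuals.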
