Abstract

We introduce c-lasso, a Python package that enables sparse and robust linear regression and classification with linear equality constraints. The underlying statistical forward model is assumed to be of the following form: \[ y = X \beta + \sigma \epsilon \qquad \textrm{subject to} \qquad C\beta=0 \] Here, $X \in \mathbb{R}^{n\times d}$ is a given design matrix and the vector $y \in \mathbb{R}^{n}$ is a continuous or binary response vector. The matrix $C$ is a general constraint matrix. The vector $\beta \in \mathbb{R}^{d}$ contains the unknown coefficients and $\sigma$ is an unknown scale. Prominent use cases are (sparse) log-contrast regression with compositional data $X$, which requires the constraint $1_d^T \beta = 0$ (Aitchison and Bacon-Shone 1984), and the Generalized Lasso, which is a special case of the described problem (see, e.g., James, Paulson, and Rusmevichientong 2020, Example 3). The c-lasso package provides estimators for inferring unknown coefficients and scale (i.e., perspective M-estimators (Combettes and Müller 2020a)) of the form \[ \min_{\beta \in \mathbb{R}^d, \sigma \in \mathbb{R}_{0}} f\left(X\beta - y,{\sigma} \right) + \lambda \left\lVert \beta\right\rVert_1 \qquad \textrm{subject to} \qquad C\beta = 0 \] for several convex loss functions $f(\cdot,\cdot)$. This includes the constrained Lasso, the constrained scaled Lasso, and sparse Huber M-estimators with linear equality constraints.
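The zero-sum constraint $1_d^T \beta = 0$ for log-contrast regression has a concrete consequence worth illustrating: it makes the linear predictor on log-transformed compositional data invariant to rescaling of each sample. The sketch below is a hypothetical, stdlib-only illustration of this property; the names (`predict`, `composition`) are ours and not part of the c-lasso API.

```python
import math
import random

# Illustration (not c-lasso code): with compositional data, a coefficient
# vector beta satisfying 1_d^T beta = 0 yields predictions on log-features
# that do not depend on the (arbitrary) total scale of each sample.

random.seed(0)
d = 5
composition = [random.random() for _ in range(d)]  # raw relative abundances
beta = [1.0, -0.5, 0.25, -0.75, 0.0]               # coefficients summing to zero
assert abs(sum(beta)) < 1e-12

def predict(sample, coef):
    # linear predictor on log-features: sum_j beta_j * log(x_j)
    return sum(b * math.log(x) for b, x in zip(coef, sample))

scaled = [10.0 * x for x in composition]           # rescale the whole sample
# rescaling shifts every log-feature by log(10); since the beta entries sum
# to zero, these shifts cancel and the prediction is unchanged
print(abs(predict(composition, beta) - predict(scaled, beta)) < 1e-9)  # True
```

This invariance is exactly why the zero-sum constraint is the natural choice for compositional design matrices, where only relative (not absolute) abundances are meaningful.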


Summary

We introduce c-lasso, a Python package that enables sparse and robust linear regression and classification with linear equality constraints. ∥β∥1 subject to Cβ = 0 for several convex loss functions f (·, ·) This includes the constrained Lasso, the constrained scaled Lasso, sparse Huber M-estimators with linear equality constraints, and constrained (Huberized) Square Hinge Support Vector Machines (SVMs) for classification. # Formulation R1 problem.formulation.huber = False problem.formulation.concomitant = False problem.formulation.classification = False This regression problem uses the Huber loss hρ as objective function for robust model fitting with an L1 penalty and linear equality constraints on the β vector. R4 Constrained sparse Huber regression with concomitant scale estimation: This formulation combines R2 and R3 allowing robust joint estimation of the (constrained) β vector and the scale σ in a concomitant fashion (Combettes & Müller, 2020a, 2020b).

Constrained sparse classification with Square Hinge loss:
Constrained sparse classification with Huberized Square Hinge loss:
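The two classification losses above can be sketched in a few lines. Below are assumed standard definitions for a margin z = y · x^T β (not c-lasso's exact code): the Square Hinge loss, and a Huberized variant that grows linearly instead of quadratically for badly misclassified points, mirroring the robustness idea of the Huber regression losses.

```python
# Hedged sketch of the two classification losses (assumed conventions).

def square_hinge(z):
    # zero for confidently correct points (margin z >= 1), quadratic otherwise
    return max(0.0, 1.0 - z) ** 2

def huberized_square_hinge(z, rho=-1.0):
    # quadratic on [rho, 1), linear below rho; the linear branch is chosen
    # so value and slope match at z = rho (C^1-continuity)
    if z >= 1.0:
        return 0.0
    if z >= rho:
        return (1.0 - z) ** 2
    return (1.0 - rho) ** 2 + 2.0 * (1.0 - rho) * (rho - z)

print(square_hinge(2.0))  # 0.0: correctly classified with margin
print(square_hinge(0.0))  # 1.0
# the two losses coincide on the quadratic region
print(huberized_square_hinge(0.0) == square_hinge(0.0))  # True
```

Huberizing the loss limits how strongly a far-misclassified outlier pulls on the decision boundary, analogous to how the Huber regression loss bounds the influence of large residuals.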
