c-lasso - a Python package for constrained sparse and robust regression and classification

@article{Simpson2021classoA,
  title={c-lasso - a Python package for constrained sparse and robust regression and classification},
  author={L{\'e}o Simpson and Patrick L. Combettes and Christian L. M{\"u}ller},
  journal={J. Open Source Softw.},
  year={2021},
  volume={6},
  pages={2844}
}
We introduce c-lasso, a Python package that enables sparse and robust linear regression and classification with linear equality constraints. The underlying statistical forward model is assumed to be of the following form: \[ y = X \beta + \sigma \epsilon \qquad \textrm{subject to} \qquad C\beta=0 \] Here, $X \in \mathbb{R}^{n\times d}$ is a given design matrix and the vector $y \in \mathbb{R}^{n}$ is a continuous or binary response vector. The matrix $C$ is a general constraint matrix. The… 
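
The forward model above can be illustrated with a small NumPy sketch. This is not the c-lasso API or any of its solvers: the function names (simulate, null_space_basis, constrained_least_squares), the zero-sum choice of $C$, and the problem dimensions are assumptions made purely for illustration. The sparsity penalty and robust losses are omitted; the sketch only fits ordinary least squares subject to $C\beta=0$ by writing $\beta = Nz$, where the columns of $N$ span the null space of $C$.

```python
import numpy as np


def simulate(n=50, d=8, sigma=0.1, seed=0):
    """Draw (X, y) from y = X beta + sigma * eps with C beta = 0 (hypothetical setup)."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, d))
    C = np.ones((1, d))            # assumed constraint: coefficients sum to zero
    beta = np.zeros(d)
    beta[0], beta[1] = 1.0, -1.0   # sparse ground truth that satisfies C beta = 0
    y = X @ beta + sigma * rng.standard_normal(n)
    return X, y, C, beta


def null_space_basis(C, tol=1e-12):
    """Return an orthonormal basis N (d x (d - rank)) with C @ N = 0."""
    _, s, vt = np.linalg.svd(C)    # full SVD: vt has shape (d, d)
    rank = int(np.sum(s > tol))
    return vt[rank:].T


def constrained_least_squares(X, y, C):
    """Minimize ||y - X beta||^2 subject to C beta = 0 via beta = N z."""
    N = null_space_basis(C)
    z, *_ = np.linalg.lstsq(X @ N, y, rcond=None)
    return N @ z


if __name__ == "__main__":
    X, y, C, beta = simulate()
    beta_hat = constrained_least_squares(X, y, C)
    print("constraint residual:", float(np.abs(C @ beta_hat).max()))
    print("estimation error:", float(np.linalg.norm(beta_hat - beta)))
```

Because every candidate $\beta$ is built as $Nz$, the constraint holds by construction up to floating-point error, so no projection step is needed after the fit.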

Supervised Learning and Model Analysis with Compositional Data
TLDR
KernelBiome is a kernel-based nonparametric regression and classification framework for compositional data that captures complex signals, including in the zero-structure, while automatically adapting model complexity and is able to incorporate prior knowledge, such as phylogenetic structure.
Bayesian Knockoff Generators for Robust Inference Under Complex Data Structure
TLDR
This work proposes Bayesian models for generating high quality knockoff copies that utilize available knowledge about the data structure, thus improving the resolution of prognostic features.
CR-Sparse: Hardware accelerated functional algorithms for sparse signal processing in Python using JAX
We introduce CR-Sparse, a Python library that efficiently solves a wide variety of sparse-representation-based signal processing problems. It is a cohesive collection of sublibraries
A causal view on compositional data
TLDR
This work provides a causal view on compositional data in an instrumental variable setting where the composition acts as the cause and advocates for multivariate alternatives using statistical data transformations and regression techniques that take the special structure of the compositional sample space into account.
Tree-aggregated predictive modeling of microbiome data
TLDR
A data-driven, parameter-free, and scalable tree-guided aggregation framework to associate microbial subcompositions with response variables of interest and posit that the inferred aggregation levels provide highly interpretable taxon groupings that can help microbial ecologists gain insights into the structure and functioning of the underlying ecosystem of interest.

References

Algorithms for Fitting the Constrained Lasso
  • Brian R. Gaines, Juhyun Kim, Hua Zhou
  • Computer Science
    Journal of computational and graphical statistics : a joint publication of American Statistical Association, Institute of Mathematical Statistics, Interface Foundation of North America
  • 2018
TLDR
This work employs the alternating direction method of multipliers (ADMM) and also derives an efficient solution path algorithm for solving the constrained lasso problem, and shows that, for an arbitrary penalty matrix, the generalized lasso can be transformed to a constrained lasso, while the converse is not true.
Piecewise linear regularized solution paths
We consider the generic regularized optimization problem $\beta(\lambda) = \arg\min_{\beta} L(y, X\beta) + \lambda J(\beta)$. Efron, Hastie, Johnstone and Tibshirani [Ann. Statist. 32 (2004) 407-499] have shown that for the LASSO-that
Regression Models for Compositional Data: General Log-Contrast Formulations, Proximal Optimization, and Microbiome Data Applications
TLDR
A general convex optimization model for linear log-contrast regression which includes many previous proposals as special cases is proposed and a proximal algorithm is introduced that solves the resulting constrained optimization problem exactly with rigorous convergence guarantees.
The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 2nd Edition
TLDR
This major new edition features many topics not covered in the original, including graphical models, random forests, ensemble methods, least angle regression and path algorithms for the lasso, non-negative matrix factorization, and spectral clustering.
Perspective maximum likelihood-type estimation via proximal decomposition
We introduce an optimization model for maximum likelihood-type estimation (M-estimation) that generalizes a large class of existing statistical models, including Huber's concomitant M-estimator,
Penalized and Constrained Optimization: An Application to High-Dimensional Website Advertising
TLDR
The Penalized and Constrained optimization method (PaC) is developed to compute the solution path for high-dimensional, linearly constrained criteria and is applied to a proprietary dataset in an exemplar Internet advertising case study and demonstrates its superiority over existing methods in this practical setting.
Variable selection in regression with compositional covariates
TLDR
An l1 regularization method for the linear log-contrast model that respects the unique features of compositional data is proposed and its usefulness is illustrated by an application to a microbiome study relating human body mass index to gut microbiome composition.
Regression Analysis for Microbiome Compositional Data
One important problem in microbiome analysis is to identify the bacterial taxa that are associated with a response, where the microbiome data are summarized as the composition of the bacterial taxa
Robust regression with compositional covariates
The Elements of Statistical Learning: Data Mining, Inference, and Prediction
TLDR
This book is a valuable resource, both for the statistician needing an introduction to machine learning and related fields and for the computer scientist wishing to learn more about statistics, and statisticians will especially appreciate that it is written in their own language.