# The Selectively Adaptive Lasso

```bibtex
@article{Schuler2022TheSA,
  title   = {The Selectively Adaptive Lasso},
  author  = {Alejandro Schuler and Mark J. van der Laan},
  journal = {ArXiv},
  year    = {2022},
  volume  = {abs/2205.10697}
}
```

Machine learning regression methods allow estimation of functions without unrealistic parametric assumptions. Although they can perform exceptionally well in terms of prediction error, most lack the theoretical convergence rates necessary for semi-parametric efficient estimation (e.g. TMLE, AIPW) of parameters like average treatment effects. The Highly Adaptive Lasso (HAL) is the only regression method proven to converge quickly enough for a meaningfully large class of functions, independent of the dimensionality…


## References

Showing 1–10 of 27 references.

The Highly Adaptive Lasso Estimator

- 2016 IEEE International Conference on Data Science and Advanced Analytics (DSAA), 2016

A novel nonparametric regression estimator is proposed that, in contrast to many existing methods, does not rely on local smoothness assumptions nor is it constructed using local smoothing techniques, and respects global smoothness constraints by virtue of falling in a class of right-hand continuous functions with left-hand limits that have variation norm bounded by a constant.
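As a hedged, simplified illustration of that construction (a one-dimensional sketch only, not the full HAL implementation): HAL represents such a function with zero-order spline (indicator) basis functions anchored at the observed data points, so that a lasso fit over these columns yields a right-continuous step function whose variation norm is the L1 norm of the coefficients — the quantity the method penalizes.

```python
import numpy as np

def hal_basis_1d(x, knots):
    """Zero-order spline (indicator) basis: column k is 1{x >= knots[k]}.

    A lasso fit over these columns gives a piecewise-constant,
    right-continuous function; its variation norm equals the L1 norm
    of the coefficients, which is what the variation-norm bound controls.
    """
    return (x[:, None] >= knots[None, :]).astype(float)

x = np.array([0.2, 0.5, 0.9])
H = hal_basis_1d(x, np.sort(x))  # knots placed at the observed points
print(H)
# [[1. 0. 0.]
#  [1. 1. 0.]
#  [1. 1. 1.]]
```

With knots at the observed points, the basis matrix is lower-triangular: each observation activates its own knot and every knot to its left.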

hal9001: Scalable highly adaptive lasso regression in R

- Journal of Open Source Software, 2020

The hal9001 R package provides a computationally efficient implementation of the highly adaptive lasso (HAL), a flexible nonparametric regression and machine learning algorithm endowed with several…

Boosted Lasso

- 2004

The Boosted Lasso (BLasso) algorithm is proposed, which ties the boosting algorithm to the lasso method and extends to minimizing a general convex loss penalized by a general convex function.

Efficient Estimation of Pathwise Differentiable Target Parameters with the Undersmoothed Highly Adaptive Lasso

- 2019

It is established that this Spline-HAL-MLE yields an asymptotically efficient estimator of any smooth feature of the functional parameter under an easily verifiable global undersmoothing condition.

Boosting with early stopping: Convergence and consistency

- 2005

This paper studies the numerical convergence, consistency, and statistical rates of convergence of boosting with early stopping when it is carried out over the linear span of a family of basis functions, and gives a rigorous proof that, for a linearly separable problem, AdaBoost becomes an L1-margin maximizer when left to run to convergence.

Regression Shrinkage and Selection via the Lasso

- 1996

A new method for estimation in linear models called the lasso, which minimizes the residual sum of squares subject to the sum of the absolute value of the coefficients being less than a constant, is proposed.

Greedy function approximation: A gradient boosting machine

- 2001

A general gradient descent boosting paradigm is developed for additive expansions based on any fitting criterion, and specific algorithms are presented for least-squares, least absolute deviation, and Huber-M loss functions for regression, and multiclass logistic likelihood for classification.
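A minimal sketch of this paradigm for the least-squares case (function and variable names here are illustrative, not from the paper): each round fits a regression stump to the current residuals — the negative gradient of the squared loss — and adds a shrunken copy of it to the ensemble.

```python
import numpy as np

def fit_stump(X, r):
    """Best single-split regression stump for residuals r (squared error)."""
    best_err, best = np.inf, (0, 0.0, r.mean(), r.mean())
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j])[:-1]:          # skip max so both sides are nonempty
            left = X[:, j] <= t
            lv, rv = r[left].mean(), r[~left].mean()
            err = ((r[left] - lv) ** 2).sum() + ((r[~left] - rv) ** 2).sum()
            if err < best_err:
                best_err, best = err, (j, t, lv, rv)
    return best

def boost(X, y, n_rounds=30, lr=0.1):
    """Least-squares gradient boosting with stumps; returns fitted values."""
    pred = np.full(len(y), y.mean())
    for _ in range(n_rounds):
        resid = y - pred                            # negative gradient of squared loss
        j, t, lv, rv = fit_stump(X, resid)
        pred = pred + lr * np.where(X[:, j] <= t, lv, rv)
    return pred

rng = np.random.default_rng(0)
X = rng.uniform(size=(100, 3))
y = np.sin(4 * X[:, 0]) + 0.1 * rng.normal(size=100)
mse0 = np.mean((y - y.mean()) ** 2)                 # baseline: constant predictor
mse = np.mean((y - boost(X, y)) ** 2)               # boosted fit is much closer
```

The learning rate `lr` is the shrinkage that the paper's paradigm allows for any fitting criterion; swapping the squared-loss residual for another loss's negative gradient yields the other variants listed above.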

Why Machine Learning Cannot Ignore Maximum Likelihood Estimation

- ArXiv, 2021

It is asserted that one essential idea is for machine learning to integrate maximum likelihood for estimation of functional parameters, such as prediction functions and conditional densities.

Higher order influence functions and minimax estimation of nonlinear functionals

- 2008

We present a theory of point and interval estimation for nonlinear functionals in parametric, semi-, and non-parametric models based on higher order influence functions (Robins (2004), Section 9; Li…

Regularization Paths for Generalized Linear Models via Coordinate Descent

- Journal of Statistical Software, 2010

In comparative timings, the new algorithms are considerably faster than competing methods and can handle large problems and can also deal efficiently with sparse features.
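A hedged sketch of the core update (a toy cyclic coordinate descent for the lasso, not the glmnet implementation): each coordinate is minimized in closed form by soft-thresholding its partial correlation with the residual, which is what makes the per-coordinate cost so low.

```python
import numpy as np

def soft_threshold(z, t):
    """Closed-form solution of min_b (b - z)^2 / 2 + t * |b|."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_cd(X, y, alpha, n_iter=200):
    """Cyclic coordinate descent for (1/2n)||y - Xb||^2 + alpha * ||b||_1."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    r = y - X @ b                                   # running residual
    for _ in range(n_iter):
        for j in range(p):
            r += X[:, j] * b[j]                     # remove coordinate j
            b[j] = soft_threshold(X[:, j] @ r / n, alpha) / col_sq[j]
            r -= X[:, j] * b[j]                     # restore with updated value
    return b

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 8))
beta = np.array([2.0, -1.5, 0, 0, 0, 0, 0, 0])      # sparse truth
y = X @ beta + 0.1 * rng.normal(size=100)
b_hat = lasso_cd(X, y, alpha=0.1)                   # nulls driven to ~0
```

Maintaining the residual incrementally keeps each coordinate update O(n); the efficiency with sparse features mentioned above comes from the same update touching only the nonzero entries of each column.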