The Selectively Adaptive Lasso

Alejandro Schuler, Mark J. van der Laan
Machine learning regression methods allow estimation of functions without unrealistic parametric assumptions. Although they can perform exceptionally well in terms of prediction error, most lack the theoretical convergence rates necessary for semi-parametric efficient estimation (e.g. TMLE, AIPW) of parameters like average treatment effects. The Highly Adaptive Lasso (HAL) is the only regression method proven to converge quickly enough for a meaningfully large class of functions, independent of the dimensionality…

The Highly Adaptive Lasso Estimator
  • D. Benkeser, M. J. van der Laan
  • Computer Science, Mathematics
    2016 IEEE International Conference on Data Science and Advanced Analytics (DSAA)
  • 2016
A novel nonparametric regression estimator is proposed that, in contrast to many existing methods, neither relies on local smoothness assumptions nor is constructed using local smoothing techniques. Instead, it respects global smoothness constraints by falling in a class of right-hand continuous functions with left-hand limits whose variation norm is bounded by a constant.
hal9001: Scalable highly adaptive lasso regression in R
The hal9001 R package provides a computationally efficient implementation of the highly adaptive lasso (HAL), a flexible nonparametric regression and machine learning algorithm endowed with several
Boosted Lasso
The Boosted Lasso (BLasso) algorithm is proposed, which ties the Boosting algorithm with the Lasso method and is extended to minimizing a general convex loss penalized by a general convex function.
Efficient Estimation of Pathwise Differentiable Target Parameters with the Undersmoothed Highly Adaptive Lasso
It is established that this Spline-HAL-MLE yields an asymptotically efficient estimator of any smooth feature of the functional parameter under an easily verifiable global undersmoothing condition.
Boosting with early stopping: Convergence and consistency
This paper studies numerical convergence, consistency and statistical rates of convergence of boosting with early stopping, when it is carried out over the linear span of a family of basis functions, and leads to a rigorous proof that for a linearly separable problem, AdaBoost becomes an L1-margin maximizer when left to run to convergence.
Regression Shrinkage and Selection via the Lasso
A new method for estimation in linear models called the lasso, which minimizes the residual sum of squares subject to the sum of the absolute value of the coefficients being less than a constant, is proposed.
Greedy function approximation: A gradient boosting machine
A general gradient descent boosting paradigm is developed for additive expansions based on any fitting criterion, and specific algorithms are presented for least-squares, least absolute deviation, and Huber-M loss functions for regression, and multiclass logistic likelihood for classification.
Why Machine Learning Cannot Ignore Maximum Likelihood Estimation
It is asserted that one essential idea is for machine learning to integrate maximum likelihood for estimation of functional parameters, such as prediction functions and conditional densities.
Higher order influence functions and minimax estimation of nonlinear functionals
We present a theory of point and interval estimation for nonlinear functionals in parametric, semi-, and non-parametric models based on higher order influence functions (Robins (2004), Section 9; Li
Regularization Paths for Generalized Linear Models via Coordinate Descent
In comparative timings, the new algorithms are considerably faster than competing methods and can handle large problems and can also deal efficiently with sparse features.