# The Dantzig selector: Statistical estimation when p is much larger than n

@article{Cands2007TheDS,
title={The Dantzig selector: Statistical estimation when p is much larger than n},
author={Emmanuel J. Cand{\`e}s and Terence Tao},
journal={Annals of Statistics},
year={2007},
volume={35},
pages={2313-2351}
}
• Published 5 June 2005
• Computer Science
• Annals of Statistics
In many important statistical applications, the number of variables or parameters p is much larger than the number of observations n. Suppose then that we have observations y = Xβ + z, where β ∈ ℝp is a parameter vector of interest, X is a data matrix with possibly far fewer rows than columns, n ≪ p, and the zi's are i.i.d. N(0, σ²). Is it possible to estimate β reliably based on the noisy data y?
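The paper's estimator, the Dantzig selector, minimizes ‖β‖₁ subject to ‖Xᵀ(y − Xβ)‖∞ ≤ λ, which can be recast as a linear program via the split β = u − v with u, v ≥ 0. As a minimal sketch: the solver choice (`scipy.optimize.linprog`), the toy problem, and the value of λ below are illustrative assumptions, not specifics from the paper.

```python
# Sketch: the Dantzig selector min ||beta||_1  s.t.  ||X^T (y - X beta)||_inf <= lam
# recast as a linear program with beta = u - v, u >= 0, v >= 0.
import numpy as np
from scipy.optimize import linprog

def dantzig_selector(X, y, lam):
    n, p = X.shape
    G = X.T @ X                      # p x p Gram matrix
    c = np.ones(2 * p)               # objective: sum(u) + sum(v) = ||beta||_1
    # |X^T y - G beta| <= lam, written as two one-sided constraints on (u, v):
    A_ub = np.block([[G, -G], [-G, G]])
    b_ub = np.concatenate([lam + X.T @ y, lam - X.T @ y])
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=(0, None), method="highs")
    uv = res.x
    return uv[:p] - uv[p:]

# Toy instance with n << p and a 3-sparse beta (illustrative only).
rng = np.random.default_rng(0)
n, p = 50, 100
X = rng.standard_normal((n, p)) / np.sqrt(n)   # roughly unit-norm columns
beta = np.zeros(p)
beta[:3] = 5.0
y = X @ beta + 0.1 * rng.standard_normal(n)
beta_hat = dantzig_selector(X, y, lam=0.1 * np.sqrt(2 * np.log(p)))
```

The choice λ ∝ σ·√(2 log p) mirrors the scaling used in the paper's analysis; on a well-conditioned instance like this, the three large coefficients dominate the fitted vector.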
3,363 Citations

## Citations

### Discussion: The Dantzig selector: Statistical estimation when p is much larger than n

The conditions of this paper using the Dantzig selector and those of Bunea, Tsybakov and Wegkamp using the Lasso are presented together, since these authors emphasize different points and use different normalizations.

### Thresholding Procedures for High Dimensional Variable Selection and Statistical Estimation

Under the restricted eigenvalue conditions, the multi-step thresholding procedure can accurately estimate a sparse vector β ∈ ℝp in a linear model; moreover, if X obeys a uniform uncertainty principle and the true parameter is sufficiently sparse, the Gauss-Dantzig selector achieves the ℓ2 loss within a logarithmic factor of the ideal mean squared error.

### Distributed Testing and Estimation Under Sparse High Dimensional Models

• Computer Science
Annals of Statistics
• 2018
This paper addresses the important question of how large k can be, as n grows large, such that the loss of efficiency due to the divide-and-conquer algorithm is negligible.

### Statistical Estimation in High Dimension, Sparsity and Oracle Inequalities

This work studies the statistical properties of two types of procedures: penalized risk minimization with a penalty term on the set of potential parameters, and exponential weights procedures, and establishes oracle inequalities for the $L^{\pi}$ norm, $1 \leqslant \pi \leqslant \infty$.

### Simultaneous Analysis of Lasso and Dantzig Selector, by Peter J. Bickel, Ya'acov Ritov and Alexandre B. Tsybakov

• Computer Science
• 2009
It is shown that, under a sparsity scenario, the Lasso estimator and the Dantzig selector exhibit similar behavior, and oracle inequalities for the prediction risk in the general nonparametric regression model and bounds on the ℓp estimation loss for 1 ≤ p ≤ 2 in the linear model are derived.

### Distributed Estimation and Inference with Statistical Guarantees

• Computer Science, Mathematics
• 2015
This paper addresses the important question of how to choose k as n grows large, providing a theoretical upper bound on k such that the information loss due to the divide and conquer algorithm is negligible.

### Statistical Optimization in High Dimensions

• Computer Science
Oper. Res.
• 2012
This work proposes three algorithms to address the high-dimensional regime, where the number of samples is roughly equal to the dimensionality of the problem, and the noise magnitude may greatly exceed the magnitude of the signal itself.

### Near-ideal model selection by ℓ1 minimization

• Computer Science
• 2009
It is proved that solving a simple quadratic program achieves a squared error within a logarithmic factor of the ideal mean squared error that one would achieve with an oracle supplying perfect information about which variables should and should not be included in the model.

## References

Showing 1–10 of 117 references.

### Lasso-Type Recovery of Sparse Representations for High-Dimensional Data

• Computer Science
• 2009
Even though the Lasso cannot recover the correct sparsity pattern, the estimator is still consistent in the ℓ2-norm sense for fixed designs, under conditions on (a) the number sn of non-zero components of the coefficient vector and (b) the minimal singular values of the design matrices induced by selecting of order sn variables.

### Sparsity oracle inequalities for the Lasso

• Mathematics, Computer Science
• 2007
It is shown that the penalized least squares estimator satisfies sparsity oracle inequalities, i.e., bounds in terms of the number of non-zero components of the oracle vector, in nonparametric regression setting with random design.

### Persistence in high-dimensional linear predictor selection and the virtue of overparametrization

• Computer Science
• 2004
Under various sparsity assumptions on the optimal predictor there is “asymptotically no harm” in introducing many more explanatory variables than observations, and such practice can be beneficial in comparison with a procedure that screens in advance a small subset of explanatory variables.

### Asymptotics for lasso-type estimators

• Mathematics
• 2000
We consider the asymptotic behavior of regression estimators that minimize the residual sum of squares plus a penalty proportional to Σj |βj|^γ for some γ > 0. These estimators include the Lasso as the special case γ = 1.

### From Model Selection to Adaptive Estimation

• Computer Science
• 1997
Many different model selection information criteria can be found in the literature, in contexts including regression and density estimation; they select, among a given collection of parametric models, the model which minimizes an empirical loss plus a penalty term proportional to the dimension of the model.

### A Note on the Lasso and Related Procedures in Model Selection

• Computer Science
• 2006
It is shown that for any sample size n, when there are superfluous variables in the linear regression model and the design matrix is orthogonal, the probability that these procedures correctly identify the true set of important variables is less than a constant not depending on n.

### On Model Selection Consistency of Lasso

• Computer Science
J. Mach. Learn. Res.
• 2006
It is proved that a single condition, which is called the Irrepresentable Condition, is almost necessary and sufficient for Lasso to select the true model both in the classical fixed p setting and in the large p setting as the sample size n gets large.

### Sparsity and incoherence in compressive sampling

• Computer Science
• 2007
It is shown that ℓ1 minimization recovers x0 exactly when the number of measurements exceeds a constant times μ² · S · log n, where S is the number of nonzero components of x0 and μ is the largest entry in U properly normalized.

### Boosting for high-dimensional linear models

We prove that boosting with the squared error loss, L2Boosting, is consistent for very high-dimensional linear models, where the number of predictor variables is allowed to grow essentially as fast as an exponential of the sample size, provided the true regression function is sparse in terms of the ℓ1-norm of its coefficients.

### Information-Theoretic Limits on Sparsity Recovery in the High-Dimensional and Noisy Setting

• M. Wainwright
• Computer Science
IEEE Transactions on Information Theory
• 2009
For a noisy linear observation model based on random measurement matrices drawn from the Gaussian ensemble, this paper derives both a set of sufficient conditions for exact support recovery using an exhaustive search decoder, as well as a set of necessary conditions that any decoder must satisfy for exact support recovery.