Sharp thresholds for high-dimensional and noisy recovery of sparsity
The problem of consistently estimating the sparsity pattern of a vector $\beta^* \in \mathbb{R}^p$ based on observations contaminated by noise arises in various contexts, including subset selection in regression, structure estimation in graphical models, sparse approximation, and signal denoising. We analyze the behavior of $\ell_1$-constrained quadratic programming (QP), also referred to as the Lasso, for recovering the sparsity pattern. Our main result establishes a sharp relation between the problem dimension $p$, the number $s$ of non-zero elements in $\beta^*$, and the number of observations $n$ required for reliable recovery. For a broad class of Gaussian ensembles satisfying mutual incoherence conditions, we establish the existence of, and compute explicit values for, thresholds $\theta_\ell$ and $\theta_u$ with the following properties: for any $\nu > 0$, if $n > 2(\theta_u + \nu)\log(p - s) + s + 1$, then the Lasso succeeds in recovering the sparsity pattern with probability converging to one for large problems, whereas if $n < 2(\theta_\ell - \nu)\log(p - s) + s + 1$, then the probability of successful recovery converges to zero. For the special case of the uniform Gaussian ensemble, we show that $\theta_\ell = \theta_u = 1$, so that the threshold is sharp and exactly determined.
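To make the two sample-size conditions concrete, the following minimal sketch evaluates both inequalities exactly as they are written in the abstract and reports which regime a given $n$ falls into. The function names `recovery_bounds` and `regime` are hypothetical, the logarithm is assumed to be natural, and the threshold values $\theta_\ell$ and $\theta_u$ must be supplied by the caller; the defaults of 1 correspond to the uniform Gaussian ensemble case stated above. The statements are asymptotic, so this is only an illustration of the bookkeeping, not a finite-sample guarantee.

```python
import math


def recovery_bounds(p, s, theta_l=1.0, theta_u=1.0, nu=0.0):
    """Evaluate the two sample-size expressions as written in the abstract.

    Returns (lower, upper), where
      lower = 2 (theta_l - nu) log(p - s) + s + 1
      upper = 2 (theta_u + nu) log(p - s) + s + 1
    The natural logarithm is an assumption of this sketch.
    """
    if not 0 <= s < p:
        raise ValueError("need 0 <= s < p so that log(p - s) is defined")
    log_term = math.log(p - s)
    lower = 2.0 * (theta_l - nu) * log_term + s + 1
    upper = 2.0 * (theta_u + nu) * log_term + s + 1
    return lower, upper


def regime(n, p, s, theta_l=1.0, theta_u=1.0, nu=0.0):
    """Classify a sample size n against the two expressions.

    'success'      : n exceeds the upper expression
    'failure'      : n is below the lower expression
    'undetermined' : n lies between them, where the abstract makes no claim
    """
    lower, upper = recovery_bounds(p, s, theta_l, theta_u, nu)
    if n > upper:
        return "success"
    if n < lower:
        return "failure"
    return "undetermined"


if __name__ == "__main__":
    # Hypothetical problem sizes, chosen only for illustration.
    p, s = 1024, 8
    print(recovery_bounds(p, s, nu=0.1))
    print(regime(1000, p, s, nu=0.1))
```

With $\theta_\ell = \theta_u$ and $\nu = 0$ the two expressions coincide, which is the sense in which the threshold is sharp; a positive $\nu$ opens a gap between them that the sketch labels as undetermined.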