The first part of this paper proposes an adaptive, data-driven threshold for image denoising via wavelet soft-thresholding. The threshold is derived in a Bayesian framework, and the prior used on the wavelet coefficients is the generalized Gaussian distribution (GGD) widely used in image processing applications. The proposed threshold is simple and …
Sparsity or parsimony of statistical models is crucial for their proper interpretations, as in sciences and social sciences. Model selection is a commonly used method to find such models, but usually involves a computationally heavy combinatorial search. Lasso (Tibshirani, 1996) is now being used as a computationally feasible alternative to model selection.
High-dimensional statistical inference deals with models in which the number of parameters p is comparable to or larger than the sample size n. Since it is usually impossible to obtain consistent procedures unless p/n → 0, a line of recent work has studied models with various types of low-dimensional structure, including sparse vectors, sparse and …
We review the principles of Minimum Description Length and Stochastic Complexity as used in data compression and statistical modeling. Stochastic complexity is formulated as the solution to optimum universal coding problems extending Shannon's basic source coding theorem. The normalized maximized likelihood, mixture, and predictive codings are each shown …
  • Nicolai Meinshausen, Bin Yu
  • 2006
The Lasso is an attractive technique for regularization and variable selection for high-dimensional data, where the number of predictor variables p_n is potentially much larger than the number of samples n. However, it was recently discovered that the sparsity pattern of the Lasso estimator can only be asymptotically identical to the true sparsity pattern …
Given i.i.d. observations of a random vector X ∈ R^p, we study the problem of estimating both its covariance matrix Σ* and its inverse covariance or concentration matrix Θ* = (Σ*)⁻¹. When X is multivariate Gaussian, the non-zero structure of Θ* is specified by the graph of an associated Gaussian Markov random field; and a popular estimator for such …
Using in situ hyperspectral measurements collected in the Sierra Nevada Mountains in California, we discriminate six species of conifer trees using a recent, nonparametric statistics technique known as penalized discriminant analysis (PDA). A classification accuracy of 76% is obtained. Our emphasis is on providing an intuitive, geometric description of PDA …
Recently much attention has been devoted to model selection through regularization methods in regression and classification where features are selected by use of a penalty function (e.g. Lasso in Tibshirani, 1996). While the resulting sparsity leads to more interpretable models, one may want to further incorporate natural groupings or hierarchical …
The method of wavelet thresholding for removing noise, or denoising, has been researched extensively due to its effectiveness and simplicity. Much of the literature has focused on developing the best uniform threshold or best basis selection. However, not much has been done to make the threshold values adaptive to the spatially changing statistics of …
  • Peter Bühlmann (ETH Zürich), Bin Yu
  • 2002
This paper investigates a computationally simple variant of boosting, L2Boost, which is constructed from a functional gradient descent algorithm with the L2-loss function. As other boosting algorithms, L2Boost uses many times in an iterative fashion a pre-chosen fitting method, called the learner. Based on the explicit expression of refitting of residuals …