Sparsity or parsimony of statistical models is crucial for their proper interpretations, as in sciences and social sciences. Model selection is a commonly used method to find such models, but usually involves a computationally heavy combinatorial search. Lasso (Tibshirani, 1996) is now being used as a computationally feasible alternative to model selection.…

The first part of this paper proposes an adaptive, data-driven threshold for image denoising via wavelet soft-thresholding. The threshold is derived in a Bayesian framework, and the prior used on the wavelet coefficients is the generalized Gaussian distribution (GGD) widely used in image processing applications. The proposed threshold is simple and…

High-dimensional statistical inference deals with models in which the number of parameters p is comparable to or larger than the sample size n. Since it is usually impossible to obtain consistent procedures unless p/n → 0, a line of recent work has studied models with various types of low-dimensional structure, including sparse vectors, sparse and…

We review the principles of Minimum Description Length and Stochastic Complexity as used in data compression and statistical modeling. Stochastic complexity is formulated as the solution to optimum universal coding problems extending Shannon's basic source coding theorem. The normalized maximized likelihood, mixture, and predictive codings are each shown…

- Nicolai Meinshausen, Bin Yu
- 2006

The Lasso is an attractive technique for regularization and variable selection for high-dimensional data, where the number of predictor variables p_n is potentially much larger than the number of samples n. However, it was recently discovered that the sparsity pattern of the Lasso estimator can only be asymptotically identical to the true sparsity pattern…

Given i.i.d. observations of a random vector X ∈ R^p, we study the problem of estimating both its covariance matrix Σ*, and its inverse covariance or concentration matrix Θ* = (Σ*)^{-1}. When X is multivariate Gaussian, the non-zero structure of Θ* is specified by the graph of an associated Gaussian Markov random field; and a popular estimator for such…

Using in situ hyperspectral measurements collected in the Sierra Nevada Mountains in California, we discriminate six species of conifer trees using a recent, nonparametric statistics technique known as penalized discriminant analysis (PDA). A classification accuracy of 76% is obtained. Our emphasis is on providing an intuitive, geometric description of PDA…

The method of wavelet thresholding for removing noise, or denoising, has been researched extensively due to its effectiveness and simplicity. Much of the literature has focused on developing the best uniform threshold or best basis selection. However, not much has been done to make the threshold values adaptive to the spatially changing statistics of…

- Peng Zhao, Guilherme Rocha, Bin Yu
- 2006

Recently much attention has been devoted to model selection through regularization methods in regression and classification where features are selected by use of a penalty function (e.g. Lasso in Tibshirani, 1996). While the resulting sparsity leads to more interpretable models, one may want to further incorporate natural groupings or hierarchical…

- Bin Yu
- 2011

Networks or graphs can easily represent a diverse set of data sources that are characterized by interacting units or actors. Social networks, representing people who communicate with each other, are one example. Communities or clusters of highly connected actors form an essential feature in the structure of several empirical networks. Spectral clustering is…