We review the principles of Minimum Description Length and Stochastic Complexity as used in data compression and statistical modeling. Stochastic complexity is formulated as the solution to optimum universal coding problems extending Shannon's basic source coding theorem. The normalized maximized likelihood, mixture, and predictive codings are each shown…

Sparsity or parsimony of statistical models is crucial for their proper interpretation, as in the sciences and social sciences. Model selection is a commonly used method to find such models, but usually involves a computationally heavy combinatorial search. Lasso (Tibshirani, 1996) is now being used as a computationally feasible alternative to model selection.…

The first part of this paper proposes an adaptive, data-driven threshold for image denoising via wavelet soft-thresholding. The threshold is derived in a Bayesian framework, and the prior used on the wavelet coefficients is the generalized Gaussian distribution (GGD) widely used in image processing applications. The proposed threshold is simple and…
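As a rough illustration of the soft-thresholding step this abstract refers to, here is a minimal sketch in plain Python. It shows only the generic shrinkage operator, sign(x) · max(|x| − t, 0), applied to a list of coefficients. The function name `soft_threshold`, the sample values, and the fixed threshold of 1.0 are assumptions made for illustration; the adaptive Bayesian threshold proposed in the paper is not reproduced here, since the abstract is truncated before it is stated.

```python
def soft_threshold(coeffs, threshold):
    """Shrink each coefficient toward zero by `threshold`.

    Coefficients whose magnitude is at most `threshold` become 0.0;
    larger ones keep their sign and lose `threshold` in magnitude.
    """
    if threshold < 0:
        raise ValueError("threshold must be non-negative")
    out = []
    for c in coeffs:
        magnitude = abs(c) - threshold
        if magnitude <= 0:
            out.append(0.0)  # small coefficient: treated as noise
        else:
            out.append(magnitude if c > 0 else -magnitude)
    return out


# Hypothetical coefficients and a hand-picked threshold (not data-driven).
noisy = [3.0, -0.4, 0.9, -2.5, 0.1]
denoised = soft_threshold(noisy, 1.0)
print(denoised)  # [2.0, 0.0, 0.0, -1.5, 0.0]
```

In a real denoising pipeline this operator would be applied to the detail coefficients of a wavelet transform, with the threshold chosen from the data rather than fixed by hand.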

High-dimensional statistical inference deals with models in which the number of parameters p is comparable to or larger than the sample size n. Since it is usually impossible to obtain consistent procedures unless p/n → 0, a line of recent work has studied models with various types of low-dimensional structure, including sparse vectors, sparse and…

Using in situ hyperspectral measurements collected in the Sierra Nevada Mountains in California, we discriminate six species of conifer trees using a recent, nonparametric statistics technique known as penalized discriminant analysis (PDA). A classification accuracy of 76% is obtained. Our emphasis is on providing an intuitive, geometric description of PDA…

The method of wavelet thresholding for removing noise, or denoising, has been researched extensively due to its effectiveness and simplicity. Much of the literature has focused on developing the best uniform threshold or best basis selection. However, not much has been done to make the threshold values adaptive to the spatially changing statistics of…

Consider the standard linear regression model $Y = X\beta^* + w$, where $Y \in \mathbb{R}^n$ is an observation vector, $X \in \mathbb{R}^{n \times d}$ is a design matrix, $\beta^* \in \mathbb{R}^d$ is the unknown regression vector, and $w \sim N(0, \sigma^2 I)$ is additive Gaussian noise. This paper studies the minimax rates of convergence for estimation of $\beta^*$ for $\ell_p$-losses and in the $\ell_2$-prediction loss, assuming that $\beta^*$…

Sparse additive models are families of d-variate functions that have the additive decomposition $f^* = \sum_{j \in S} f^*_j$, where $S$ is an unknown subset of cardinality $s \ll d$. We consider the case where each component function $f^*_j$ lies in a reproducing kernel Hilbert space, and analyze a simple kernel-based convex program for estimating the unknown function $f^*$.…