
In high-dimensional linear regression, the goal pursued here is to estimate an unknown regression function using linear combinations of a suitable set of covariates. A key assumption for the success of any statistical procedure in this setup is that the linear combination is sparse in some sense, for example, that it involves only a few… (More)

We perform a finite sample analysis of the detection levels for sparse principal components of a high-dimensional covariance matrix. Our minimax optimal test is based on a sparse eigenvalue statistic. Alas, computing this test is known to be NP-complete in general, and we describe a computationally efficient alternative test using convex relaxations. Our… (More)
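The sparse eigenvalue statistic mentioned above can be made concrete with a small brute-force sketch. This is not code from the paper; the function name and spiked-covariance example are illustrative. The exhaustive scan over supports is exactly what makes the exact statistic intractable for large dimensions and motivates the convex relaxation the abstract refers to.

```python
import itertools

import numpy as np

def k_sparse_eigenvalue(sigma, k):
    """Brute-force k-sparse largest eigenvalue of a covariance matrix.

    Scans all k-subsets of coordinates, so the cost grows as C(d, k):
    this exponential search illustrates why computing the exact test
    statistic is hard in general.
    """
    d = sigma.shape[0]
    best = -np.inf
    for support in itertools.combinations(range(d), k):
        block = sigma[np.ix_(support, support)]
        # Largest eigenvalue of the principal submatrix on this support.
        best = max(best, np.linalg.eigvalsh(block)[-1])
    return best

# Spiked covariance: identity plus a rank-one k-sparse perturbation.
d, k, theta = 6, 2, 0.5
v = np.zeros(d)
v[:k] = 1.0 / np.sqrt(k)
sigma = np.eye(d) + theta * np.outer(v, v)
stat = k_sparse_eigenvalue(sigma, k)  # approximately 1 + theta
```

Under the null (identity covariance) the statistic is 1; under the sparse spiked alternative it is 1 + theta, which is the kind of signal-strength separation a detection test exploits.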

- Quentin Berthet, Philippe Rigollet
- COLT
- 2013

In the context of sparse principal component detection, we bring evidence towards the existence of a statistical price to pay for computational efficiency. We measure the performance of a test by the smallest signal strength that it can detect and we propose a computationally efficient method based on semidefinite programming. We also prove that the… (More)

Given a collection of M different estimators or classifiers, we study the problem of model selection type aggregation, i.e., we construct a new estimator or classifier, called aggregate, which is nearly as good as the best among them with respect to a given risk criterion. We define our aggregate by a simple recursive procedure which solves an auxiliary… (More)
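The abstract does not spell out its recursive procedure here; a standard sketch in this family of aggregation methods is exponential weighting of cumulative losses. The following is an assumption-laden illustration, not the paper's exact aggregate.

```python
import numpy as np

def exp_weight_aggregate(losses, eta=1.0):
    """Aggregate M predictors by exponential weighting of cumulative losses.

    losses: array of shape (T, M), the loss of each of M base predictors
    at rounds 1..T. Returns a convex weight vector; the aggregate is the
    corresponding convex combination of the base predictors.
    """
    cum = losses.sum(axis=0)
    # Subtract the minimum before exponentiating for numerical stability.
    w = np.exp(-eta * (cum - cum.min()))
    return w / w.sum()

# Three predictors; the second incurs the smallest total loss.
losses = np.array([[0.9, 0.1, 0.5],
                   [0.8, 0.2, 0.6],
                   [0.7, 0.1, 0.4]])
weights = exp_weight_aggregate(losses, eta=5.0)
```

The weights concentrate on the predictor with the smallest cumulative loss, which is the sense in which the aggregate is "nearly as good as the best" of the collection.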

- Philippe Rigollet
- Journal of Machine Learning Research
- 2007

We consider semi-supervised classification when part of the available data is unlabeled. These unlabeled data can be useful for the classification problem when we make an assumption relating the behavior of the regression function to that of the marginal distribution. Seeger (2000) proposed the well-known cluster assumption as a reasonable one. We propose a… (More)

We study the problem of learning the best linear and convex combination of M estimators of a density with respect to the mean squared risk. We suggest aggregation procedures and we prove sharp oracle inequalities for their risks, i.e., oracle inequalities with leading constant 1. We also obtain lower bounds showing that these procedures attain optimal rates… (More)

The information era has witnessed an explosion in the collection of data that contain potentially useful information for a wide range of applications such as biology, sociology, pattern recognition, marketing and finance. As a result of this inflation, statisticians have developed a new set of tools that combine fundamental statistical concepts with… (More)

In a regression setup with deterministic design, we study the pure aggregation problem and introduce a natural extension from the Gaussian distribution to distributions in the exponential family. While this extension bears strong connections with generalized linear models, it does not require identifiability of the parameter or even that the model on the… (More)

- Quentin Berthet, Philippe Rigollet
- ArXiv
- 2013

In the context of sparse principal component detection, we bring evidence towards the existence of a statistical price to pay for computational efficiency. We measure the performance of a test by the smallest signal strength that it can detect and we propose a computationally efficient method based on semidefinite programming. We also prove that the… (More)

In the context of density level set estimation, we study the convergence of general plug-in methods under two main assumptions on the density for a given level λ. More precisely, it is assumed that the density (i) is smooth in a neighborhood of λ and (ii) has γ-exponent at level λ. Condition (i) ensures that the density can be estimated at a standard… (More)
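A minimal plug-in sketch of the idea above: estimate the density with a Gaussian kernel density estimator and threshold it at the level λ. The bandwidth and threshold below are chosen purely for illustration, not taken from the paper.

```python
import numpy as np

def kde(x, samples, h):
    """Gaussian kernel density estimate evaluated at the points x."""
    z = (x[:, None] - samples[None, :]) / h
    return np.exp(-0.5 * z**2).mean(axis=1) / (h * np.sqrt(2.0 * np.pi))

def plug_in_level_set(grid, samples, lam, h=0.3):
    """Plug-in estimate of the level set {x : p(x) >= lam}:
    plug the density estimate into the set definition."""
    return grid[kde(grid, samples, h) >= lam]

rng = np.random.default_rng(0)
samples = rng.standard_normal(2000)
grid = np.linspace(-4.0, 4.0, 801)
level_set = plug_in_level_set(grid, samples, lam=0.1)
# For N(0, 1), the true set {p >= 0.1} is roughly [-1.66, 1.66].
```

The convergence results described in the abstract quantify how fast such a plug-in set approaches the true level set as the density estimate improves.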