# Linear regression through PAC-Bayesian truncation

```bibtex
@article{Audibert2011LinearRT,
  title   = {Linear regression through PAC-Bayesian truncation},
  author  = {Jean-Yves Audibert and Olivier Catoni},
  journal = {arXiv: Statistics Theory},
  year    = {2011}
}
```

We consider the problem of predicting as well as the best linear combination of d given functions in least squares regression under $L^\infty$ constraints on the linear combination. When the input distribution is known, there already exists an algorithm having an expected excess risk of order d/n, where n is the size of the training data. Without this strong assumption, standard results often contain a multiplicative log(n) factor, complex constants involving the conditioning of the Gram matrix…

## 13 Citations

### Robust linear least squares regression

- Mathematics, Computer Science
- 2011

A new estimator based on truncating differences of losses in a min-max framework is provided; it satisfies a d/n risk bound both in expectation and in deviations, the main novelty being the absence of any exponential moment condition on the output distribution while still achieving exponential deviations.
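The truncation device this summary refers to can be sketched in a few lines: instead of comparing candidate predictors by raw empirical risks, compare them by clipped differences of squared losses and pick the candidate whose worst pairwise comparison is smallest. This is a minimal illustration assuming a finite candidate set; the truncation level `lam` is a hypothetical fixed constant, whereas the paper tunes it from the problem's parameters.

```python
import numpy as np

def truncated_loss_diff(theta1, theta2, X, y, lam=1.0):
    """Mean of truncated differences of squared losses between two
    candidate linear predictors (sketch; `lam` is an assumed level)."""
    diffs = (X @ theta1 - y) ** 2 - (X @ theta2 - y) ** 2
    return float(np.mean(np.clip(diffs, -lam, lam)))

def minmax_select(candidates, X, y, lam=1.0):
    """Pick the candidate whose worst-case truncated loss difference
    against every other candidate is smallest (min-max step)."""
    scores = [max(truncated_loss_diff(t, s, X, y, lam) for s in candidates)
              for t in candidates]
    return candidates[int(np.argmin(scores))]
```

The clipping is what removes the need for exponential moments on the output: a single wildly wrong observation can move each pairwise comparison by at most `lam`.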

### Exact minimax risk for linear least squares, and the lower tail of sample covariance matrices

- Mathematics, Computer Science
- The Annals of Statistics
- 2022

The main technical contribution is the study of the lower tail of the smallest singular value of empirical covariance matrices around $0$: a lower bound on this lower tail is established, valid for any distribution in dimension $d \geq 2$, together with a matching upper bound under a necessary regularity condition.

### Distribution-Free Robust Linear Regression

- Mathematics
- Mathematical Statistics and Learning
- 2022

Using the ideas of truncated least squares, median-of-means procedures, and aggregation theory, a non-linear estimator achieving excess risk of order d/n with an optimal sub-exponential tail is constructed.
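The median-of-means step mentioned above is simple to state: split the sample into blocks, average within each block, and take the median of the block means. This is a generic sketch of the procedure, not the paper's exact estimator; the block count `k` is an assumption (typically chosen from the desired confidence level).

```python
import numpy as np

def median_of_means(x, k=5):
    """Median-of-means estimate of the mean of a 1-D sample:
    split into k blocks, average each block, return the median
    of the block means (generic sketch)."""
    blocks = np.array_split(np.asarray(x, dtype=float), k)
    return float(np.median([b.mean() for b in blocks]))
```

A single gross outlier corrupts at most one block mean, so it cannot move the median, which is the source of the sub-exponential tails the summary refers to.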

### Suboptimality of Constrained Least Squares and Improvements via Non-Linear Predictors

- Mathematics
- Bernoulli
- 2023

A bounded distribution is constructed such that the constrained least squares estimator incurs an excess risk of order $\Omega(d^{3/2}/n)$, hence refuting a recent conjecture of Ohad Shamir [JMLR 2015]; it is also observed that non-linear predictors can achieve the optimal rate of $O(d/n)$.

### `Basic' Generalization Error Bounds for Least Squares Regression with Well-specified Models

- Mathematics
- 2021

This note examines the behavior of generalization capabilities as defined by out-of-sample mean squared error (MSE) of Linear Gaussian (with a fixed design matrix) and Linear Least Squares…

### From robust tests to Bayes-like posterior distributions

- Mathematics
- 2021

In the Bayes paradigm and for a given loss function, we propose the construction of a new type of posterior distributions for estimating the law of an n-sample. The loss functions we have in mind are…

### Adaptive Minimax Estimation over Sparse l q-Hulls

- Computer Science
- 2011

The results show that the minimax rate of $\ell_q$-aggregation is basically determined by an effective model size, which is a sparsity index that depends on $q$, $t_n$, $M_n$ and the sample size $n$ in an easily interpretable way based on a classical model selection theory that deals with a large number of models.

### Hard-Margin Active Linear Regression

- Computer Science, Mathematics
- ICML
- 2014

It is shown that active learning admits significantly better sample complexity bounds than its passive learning counterpart, and efficient algorithms attaining near-optimal bounds are given.

### PAC-Bayesian estimation and prediction in sparse additive models

- Computer Science
- Electronic Journal of Statistics
- 2013

A PAC-Bayesian strategy is investigated, delivering oracle inequalities in probability in high-dimensional additive models under a sparsity assumption, and its performance is assessed on simulated data.

### Dimension-free Bounds for Sums of Independent Matrices and Simple Tensors via the Variational Principle

- Mathematics, Computer Science
- 2021

This work considers deviation inequalities for sums of independent d by d random matrices, as well as rank-one random tensors, and presents bounds that do not depend explicitly on the dimension d, but rather on the effective rank.
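The effective rank that replaces the ambient dimension in such bounds is a standard quantity: the trace of a positive semi-definite matrix divided by its operator norm. A one-line sketch:

```python
import numpy as np

def effective_rank(sigma):
    """Effective rank tr(Sigma) / ||Sigma||_op of a PSD matrix:
    at most the ambient dimension, and much smaller when the
    spectrum decays quickly."""
    eig = np.linalg.eigvalsh(sigma)  # eigenvalues in ascending order
    return float(eig.sum() / eig.max())
```

For the identity in dimension d this equals d, while a spectrum dominated by one large eigenvalue gives a value close to 1, which is why such bounds are called dimension-free.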

## References

Showing 1–10 of 30 references.

### Robust linear least squares regression

- Mathematics, Computer Science
- 2011

A new estimator based on truncating differences of losses in a min-max framework is provided; it satisfies a d/n risk bound both in expectation and in deviations, the main novelty being the absence of any exponential moment condition on the output distribution while still achieving exponential deviations.

### Fast learning rates in statistical inference through aggregation

- Computer Science
- 2009

We develop minimax optimal risk bounds for the general learning task consisting in predicting as well as the best function in a reference set G up to the smallest possible additive term, called the…

### Pac-Bayesian Bounds for Sparse Regression Estimation with Exponential Weights

- Computer Science, Mathematics
- 2011

A sparsity oracle inequality in probability for the true excess risk of a version of the exponential weights estimator is presented, and an MCMC method is proposed to compute the estimator for reasonably large values of $p$.
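The exponential weights aggregate behind this summary weights each candidate predictor proportionally to the exponential of its negative empirical risk. Below is a small-`p` sketch in which the weights are computed exactly; the paper resorts to MCMC only because this sum becomes intractable for large `p`. The inverse temperature `beta` is an assumed fixed constant here.

```python
import numpy as np

def exponential_weights(preds, y, beta=1.0):
    """Exponentially weighted aggregate of p candidate predictors.
    `preds` is a (p, n) array of in-sample predictions; weights are
    proportional to exp(-beta * empirical squared risk)."""
    risks = np.mean((preds - y) ** 2, axis=1)
    w = np.exp(-beta * (risks - risks.min()))  # subtract min for stability
    w /= w.sum()
    return w @ preds  # aggregated prediction vector of length n
```

With a large `beta` the aggregate concentrates on the empirically best candidate; with a small `beta` it averages more evenly, which is what drives the oracle inequality.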

### PAC-BAYESIAN SUPERVISED CLASSIFICATION: The Thermodynamics of Statistical Learning

- Computer Science
- 2007

An alternative selection scheme based on relative bounds between estimators is described and studied, and a two-step localization technique which can handle the selection of a parametric model from a family of such models is presented.

### Aggregating Regression Procedures for a Better Performance

- Mathematics
- 1999

Methods have been proposed to linearly combine candidate regression procedures to improve estimation accuracy. Applications of these methods in many examples are very successful, pointing to the…

### Model selection for regression on a fixed design

- Mathematics, Computer Science
- 2000

This work considers a collection of finite-dimensional linear spaces and the least-squares estimator built on a model selected from this collection in a data-driven way, and derives adaptivity properties of this estimator that hold under mild moment conditions on the errors.

### Optimal Rates for the Regularized Least-Squares Algorithm

- Mathematics, Computer Science
- Found. Comput. Math.
- 2007

A complete minimax analysis of the problem is described, showing that the convergence rates obtained by regularized least-squares estimators are indeed optimal over a suitable class of priors defined by the considered kernel.

### PAC-Bayesian bounds for randomized empirical risk minimizers

- Computer Science, Mathematics
- 2007

The aim of this paper is to generalize the PAC-Bayesian theorems proved by Catoni in the classification setting to more general problems of statistical inference, and to bound the risk of very general estimation procedures.

### Challenging the empirical mean and empirical variance: a deviation study

- Mathematics
- 2010

We present new M-estimators of the mean and variance of real valued random variables, based on PAC-Bayes bounds. We analyze the non-asymptotic minimax properties of the deviations of those estimators…
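The mean estimator this reference studies replaces the empirical average with the root of a soft-truncated estimating equation. The sketch below uses Catoni's narrowest non-decreasing influence function and solves the equation by bisection; the scale parameter `alpha` is an assumed constant, whereas the paper tunes it from the variance and the confidence level.

```python
import numpy as np

def catoni_mean(x, alpha=0.1, tol=1e-8):
    """Catoni-style M-estimator of the mean: return the theta solving
    sum_i psi(alpha * (x_i - theta)) = 0, where
    psi(u) = log(1 + u + u^2/2) for u >= 0, with odd extension for u < 0.
    The sum is decreasing in theta, so bisection applies."""
    x = np.asarray(x, dtype=float)

    def psi(u):
        # both branch arguments are >= 1/2, so log1p is always defined
        return np.where(u >= 0, np.log1p(u + u * u / 2),
                        -np.log1p(-u + u * u / 2))

    lo, hi = x.min(), x.max()  # sum is >= 0 at lo and <= 0 at hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if psi(alpha * (x - mid)).sum() > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Because `psi` grows only logarithmically, a single huge observation shifts the root far less than it shifts the empirical mean, which is the deviation improvement the abstract describes.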