• Corpus ID: 3586027

Linear regression through PAC-Bayesian truncation

Jean-Yves Audibert and Olivier Catoni. arXiv: Statistics Theory.
We consider the problem of predicting as well as the best linear combination of d given functions in least squares regression under $L^\infty$ constraints on the linear combination. When the input distribution is known, there already exists an algorithm having an expected excess risk of order d/n, where n is the size of the training data. Without this strong assumption, standard results often contain a multiplicative log(n) factor, complex constants involving the conditioning of the Gram matrix…
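
In symbols, the setting described in the abstract can be sketched as follows. The notation ($\varphi_j$ for the d given functions, $\Theta$ for the constrained parameter set, $R$ for the risk, $\hat f$ for the estimator, $C$ for an unspecified constant) is assumed here for illustration and does not come from the abstract itself.

```latex
f_\theta = \sum_{j=1}^{d} \theta_j \varphi_j, \qquad
R(f) = \mathbb{E}\big[(Y - f(X))^2\big], \qquad
\mathbb{E}\, R(\hat f) - \inf_{\theta \in \Theta} R(f_\theta) \le C\,\frac{d}{n}.
```

The left-hand side of the last inequality is the expected excess risk; a bound "of order d/n" means it holds with a constant $C$ that does not grow with n.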

Robust linear least squares regression

A new estimator based on truncating differences of losses in a min-max framework is provided; it satisfies a d/n risk bound both in expectation and in deviations. Notably, the results require no exponential moment condition on the output distribution while still achieving exponential deviations.

Exact minimax risk for linear least squares, and the lower tail of sample covariance matrices

The main technical contribution is the study of the lower tail of the smallest singular value of empirical covariance matrices around $0$: a lower bound on this lower tail is established, valid for any distribution in dimension $d \geq 2$, together with a matching upper bound under a necessary regularity condition.

Distribution-Free Robust Linear Regression

Using the ideas of truncated least squares, median-of-means procedures, and aggregation theory, a non-linear estimator achieving excess risk of order d/n with an optimal sub-exponential tail is constructed.
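
To give a feel for one ingredient named above, here is a minimal sketch of the generic median-of-means estimator of a mean. It is not the regression estimator constructed in the paper; the function name `median_of_means` and the toy sample are hypothetical and chosen only for illustration.

```python
import statistics


def median_of_means(values, num_blocks):
    """Split values into num_blocks consecutive blocks of equal size,
    average each block, and return the median of the block means.
    Trailing values that do not fill a complete block are ignored."""
    if num_blocks < 1 or num_blocks > len(values):
        raise ValueError("num_blocks must be between 1 and len(values)")
    block_size = len(values) // num_blocks
    block_means = [
        statistics.fmean(values[i * block_size:(i + 1) * block_size])
        for i in range(num_blocks)
    ]
    return statistics.median(block_means)


# Toy sample with one gross outlier (illustrative data, not from any paper).
sample = [1.0, 2.0, 3.0, 4.0, 5.0, 1000.0]
print(statistics.fmean(sample))    # the plain mean is dragged up by the outlier
print(median_of_means(sample, 3))  # prints 3.5
```

The outlier contaminates only one of the three block means, so the median of the block means is unaffected by it, which is the intuition behind the robustness of such procedures.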

Suboptimality of Constrained Least Squares and Improvements via Non-Linear Predictors

A bounded distribution is constructed such that the constrained least squares estimator incurs an excess risk of order $\Omega(d^{3/2}/n)$, hence refuting a recent conjecture of Ohad Shamir [JMLR 2015], and it is observed that non-linear predictors can achieve the optimal rate of $O(d/n)$.

'Basic' Generalization Error Bounds for Least Squares Regression with Well-specified Models

This note examines the behavior of generalization capabilities as defined by out-of-sample mean squared error (MSE) of Linear Gaussian (with a fixed design matrix) and Linear Least Squares

From robust tests to Bayes-like posterior distributions

In the Bayes paradigm and for a given loss function, we propose the construction of a new type of posterior distributions for estimating the law of an n-sample. The loss functions we have in mind are

Adaptive Minimax Estimation over Sparse $\ell_q$-Hulls

The results show that the minimax rate of $\ell_q$-aggregation is basically determined by an effective model size, which is a sparsity index that depends on $q$, $t_n$, $M_n$ and the sample size $n$ in an easily interpretable way based on a classical model selection theory that deals with a large number of models.

Hard-Margin Active Linear Regression

It is shown that active learning admits significantly better sample complexity bounds than its passive learning counterpart, and efficient algorithms that attain near-optimal bounds are given.

PAC-Bayesian estimation and prediction in sparse additive models

A PAC-Bayesian strategy is investigated, delivering oracle inequalities in probability in high-dimensional additive models under a sparsity assumption, and its performance is assessed on simulated data.

Dimension-free Bounds for Sums of Independent Matrices and Simple Tensors via the Variational Principle

This work considers deviation inequalities for sums of independent d by d random matrices, as well as rank one random tensors, and presents bounds that do not depend explicitly on the dimension d, but rather on the effective rank.

Sparse Regression Learning by Aggregation and Langevin Monte-Carlo

Fast learning rates in statistical inference through aggregation

We develop minimax optimal risk bounds for the general learning task consisting in predicting as well as the best function in a reference set G up to the smallest possible additive term, called the

PAC-Bayesian Bounds for Sparse Regression Estimation with Exponential Weights

A sparsity oracle inequality in probability for the true excess risk is presented for a version of the exponential weights estimator, and an MCMC method is proposed to compute the estimator for reasonably large values of p.

PAC-BAYESIAN SUPERVISED CLASSIFICATION: The Thermodynamics of Statistical Learning

An alternative selection scheme based on relative bounds between estimators is described and studied, and a two-step localization technique that can handle the selection of a parametric model from a family of such models is presented.

Aggregating Regression Procedures for a Better Performance

Methods have been proposed to linearly combine candidate regression procedures to improve estimation accuracy. Applications of these methods in many examples are very successful, pointing to the

Model selection for regression on a fixed design

This work considers a collection of finite-dimensional linear spaces and the least-squares estimator built on a data-driven selected model from this collection, and deduces adaptivity properties of the estimator; the method holds under mild moment conditions on the errors.

Optimal Rates for the Regularized Least-Squares Algorithm

A complete minimax analysis of the problem is described, showing that the convergence rates obtained by regularized least-squares estimators are indeed optimal over a suitable class of priors defined by the considered kernel.

PAC-Bayesian bounds for randomized empirical risk minimizers

The aim of this paper is to generalize the PAC-Bayesian theorems proved by Catoni in the classification setting to more general problems of statistical inference, and to bound the risk of very general estimation procedures.

Challenging the empirical mean and empirical variance: a deviation study

We present new M-estimators of the mean and variance of real valued random variables, based on PAC-Bayes bounds. We analyze the non-asymptotic minimax properties of the deviations of those estimators