Universal Prediction Band via Semi-Definite Programming

@article{Liang2021UniversalPB,
  title={Universal Prediction Band via Semi-Definite Programming},
  author={Tengyuan Liang},
  journal={ArXiv},
  year={2021},
  volume={abs/2103.17203}
}
  • Tengyuan Liang · Published 31 March 2021 · Mathematics, Computer Science, Economics · ArXiv
We propose a computationally efficient method to construct nonparametric, heteroskedastic prediction bands for uncertainty quantification, with or without any user-specified predictive model. The data-adaptive prediction band is universally applicable with minimal distributional assumptions, has strong non-asymptotic coverage properties, and is easy to implement using standard convex programs. Our approach can be viewed as a novel variance interpolation with confidence and further leverages… 
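The paper's exact program is only summarized above; as a rough, hypothetical sketch of the variance-interpolation idea, the snippet below fits a nonnegative variance surrogate σ²(x) = φ(x)ᵀAφ(x) with A ⪰ 0 to squared residuals via a small semidefinite program in cvxpy, then forms a plug-in ±2σ band. The polynomial feature map, regularization weight, and band multiplier are illustrative assumptions, not the paper's construction.

```python
# Illustrative sketch only: a PSD-parameterized variance model, loosely in the
# spirit of "variance interpolation". Feature map, penalty, and the +/- 2 sigma
# band are assumptions for illustration, not the paper's exact program.
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
n = 80
x = rng.uniform(-2, 2, n)
y = np.sin(2 * x) + (0.2 + 0.4 * np.abs(x)) * rng.standard_normal(n)

# Crude mean fit; any user-specified predictive model could be plugged in here.
mean_coef = np.polyfit(x, y, deg=3)
resid2 = (y - np.polyval(mean_coef, x)) ** 2

# Polynomial feature map phi(x); sigma^2(x) = phi(x)^T A phi(x) with A PSD
# guarantees a nonnegative variance estimate at every x.
def phi(t, d=3):
    return np.vander(np.atleast_1d(t), d + 1, increasing=True)

Phi = phi(x)                       # shape (n, 4)
A = cp.Variable((4, 4), PSD=True)  # the PSD constraint makes this a small SDP
fit = cp.sum_squares(cp.sum(cp.multiply(Phi @ A, Phi), axis=1) - resid2)
cp.Problem(cp.Minimize(fit + 0.1 * cp.trace(A))).solve()

# Plug-in band: mean +/- 2 * estimated sigma(x) on a grid.
grid = np.linspace(-2, 2, 5)
quad = np.einsum("ij,jk,ik->i", phi(grid), A.value, phi(grid))
sigma = np.sqrt(np.maximum(quad, 0.0))  # clip tiny negatives from the solver
band = np.c_[np.polyval(mean_coef, grid) - 2 * sigma,
             np.polyval(mean_coef, grid) + 2 * sigma]
print(band)
```

The PSD parameterization is the same device used in the non-negative-function models cited below: nonnegativity holds everywhere by construction, rather than only at the training points.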


References

Showing 1–10 of 27 references
Optimal Rates for the Regularized Least-Squares Algorithm
TLDR
A complete minimax analysis of the problem is described, showing that the convergence rates obtained by regularized least-squares estimators are indeed optimal over a suitable class of priors defined by the considered kernel.
Deep learning: a statistical viewpoint
TLDR
This article surveys recent progress in statistical learning theory that provides examples illustrating these principles in simpler settings, and focuses specifically on the linear regime for neural networks, where the network can be approximated by a linear model.
Non-parametric Models for Non-negative Functions
TLDR
This paper provides the first model for non-negative functions that benefits from the same good properties as linear models; it is proved to admit a representer theorem, and an efficient dual formulation is provided for convex problems.
Benign overfitting in linear regression
TLDR
A characterization of linear regression problems for which the minimum norm interpolating prediction rule has near-optimal prediction accuracy shows that overparameterization is essential for benign overfitting in this setting: the number of directions in parameter space that are unimportant for prediction must significantly exceed the sample size.
Just Interpolate: Kernel "Ridgeless" Regression Can Generalize
TLDR
This work isolates a phenomenon of implicit regularization for minimum-norm interpolated solutions which is due to a combination of high dimensionality of the input data, curvature of the kernel function, and favorable geometric properties of the data such as an eigenvalue decay of the empirical covariance and kernel matrices.
Learning Positive Functions in a Hilbert Space
We develop a method for learning positive functions by optimizing over SoSK, a reproducing kernel Hilbert space (RKHS) subject to a Sum-of-Squares (SoS) constraint. This constraint ensures that only…
Shape-Constrained Regression using Sum of Squares Polynomials
TLDR
It is shown how to use semidefinite programming to obtain polynomial regressors that satisfy shape constraints, such as monotonicity with respect to a variable, Lipschitz continuity, or convexity over a region, and that the regressors obtained are consistent estimators of the underlying shape-constrained function mapping feature vectors to responses. A hedged sketch of this technique follows below.
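As an illustration of the shape-constraint idea (not the cited paper's code), the sketch below fits a globally nondecreasing cubic by requiring its derivative to be a sum of squares, which becomes a small SDP in cvxpy; the data and the degree choices are assumptions for illustration.

```python
# Hedged sketch: monotone (nondecreasing) cubic regression via an SOS
# constraint on the derivative, solved as a small SDP. Data are illustrative.
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(1)
x = np.sort(rng.uniform(0, 1, 50))
y = np.sqrt(x) + 0.05 * rng.standard_normal(50)

c = cp.Variable(4)                 # p(x) = c0 + c1 x + c2 x^2 + c3 x^3
Q = cp.Variable((2, 2), PSD=True)  # Gram matrix: p'(x) = [1, x] Q [1, x]^T >= 0

V = np.vander(x, 4, increasing=True)
constraints = [c[1] == Q[0, 0],    # match p'(x) = c1 + 2 c2 x + 3 c3 x^2
               c[2] == Q[0, 1],
               3 * c[3] == Q[1, 1]]
cp.Problem(cp.Minimize(cp.sum_squares(V @ c - y)), constraints).solve()
print(c.value)  # coefficients of a cubic that is nondecreasing on all of R
```

For univariate quadratics, nonnegativity is exactly equivalent to the existence of such a PSD Gram matrix, so the monotonicity constraint here is lossless.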
Global Optimization with Polynomials and the Problem of Moments
J. Lasserre · SIAM J. Optim. · 2001
TLDR
It is shown that the problem of finding the global minimum of a real-valued polynomial p(x): Rⁿ → R over a compact set K defined by polynomial inequalities reduces to solving an (often finite) sequence of convex linear matrix inequality (LMI) problems.
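A minimal sketch of the first level of this reduction, assuming cvxpy: maximize γ subject to p(x) − γ being a sum of squares, which is an LMI in the Gram matrix. The quartic below is an illustrative example; for univariate polynomials this level of the hierarchy is already exact.

```python
# Hedged sketch of the first-level SOS/LMI relaxation:
#   max { gamma : p(x) - gamma is a sum of squares },
# written as an SDP in cvxpy for a single univariate quartic.
import cvxpy as cp

# p(x) = x^4 - 2 x^2 + x, coefficients p0..p4 in increasing degree.
p = [0.0, 1.0, -2.0, 0.0, 1.0]

gamma = cp.Variable()
Q = cp.Variable((3, 3), PSD=True)  # Gram matrix in the basis (1, x, x^2)

# Match coefficients of p(x) - gamma with [1, x, x^2] Q [1, x, x^2]^T.
constraints = [
    Q[0, 0] == p[0] - gamma,        # constant term
    2 * Q[0, 1] == p[1],            # x
    2 * Q[0, 2] + Q[1, 1] == p[2],  # x^2
    2 * Q[1, 2] == p[3],            # x^3
    Q[2, 2] == p[4],                # x^4
]
cp.Problem(cp.Maximize(gamma), constraints).solve()
print(gamma.value)  # a certified lower bound on min_x p(x)
```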
A tutorial on conformal prediction
TLDR
This tutorial presents a self-contained account of the theory of conformal prediction, which assumes only that successive examples are sampled independently from the same distribution, and works through several numerical examples showing how it can be applied to any method for producing a prediction ŷ.
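A minimal sketch of the split conformal recipe the tutorial describes, with an illustrative polynomial predictor standing in for "any method for producing ŷ"; the data, split, and miscoverage level are assumptions for illustration.

```python
# Hedged sketch of split conformal prediction: calibrate absolute residuals
# on a holdout set and form an interval with finite-sample marginal coverage.
import numpy as np

rng = np.random.default_rng(2)
x = rng.uniform(-2, 2, 200)
y = np.sin(2 * x) + 0.3 * rng.standard_normal(200)

# Split: fit any predictor on one half, calibrate on the other.
fit_idx, cal_idx = np.arange(100), np.arange(100, 200)
coef = np.polyfit(x[fit_idx], y[fit_idx], deg=3)
scores = np.abs(y[cal_idx] - np.polyval(coef, x[cal_idx]))

alpha = 0.1
n_cal = len(cal_idx)
# Conformal quantile: the ceil((n+1)(1-alpha))/n empirical quantile of scores.
level = min(1.0, np.ceil((n_cal + 1) * (1 - alpha)) / n_cal)
q = np.quantile(scores, level, method="higher")

x_new = 0.5
pred = np.polyval(coef, x_new)
print(pred - q, pred + q)  # interval with >= 90% marginal coverage
```

Note the contrast with the band in the abstract above: split conformal gives a constant-width band, while the paper targets heteroskedastic (x-dependent) widths.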
Interpolating Classifiers Make Few Mistakes
TLDR
A mistake bound is derived for MNIC and a regularized variant that holds for all data sets, and it is shown that as long as the population class-conditional distributions are sufficiently separable in total variation, MNIC generalizes at a fast rate.