Adaptive learning rates for support vector machines working on data with low intrinsic dimension

Thomas Hamm and Ingo Steinwart. The Annals of Statistics.
We derive improved regression and classification rates for support vector machines using Gaussian kernels under the assumption that the data has some low-dimensional intrinsic structure that is described by the box-counting dimension. Under some standard regularity assumptions for regression and classification we prove learning rates, in which the dimension of the ambient space is replaced by the box-counting dimension of the support of the data generating distribution. In the regression case… 
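The box-counting dimension mentioned in the abstract can be illustrated numerically. The following is a minimal sketch, not taken from the paper: it covers a point cloud with grid cells of side length eps, counts the occupied cells N(eps), and fits the slope of log N(eps) against log(1/eps). The function name `box_counting_dimension`, the chosen scales, and the synthetic line-segment data are assumptions made purely for illustration.

```python
import numpy as np


def box_counting_dimension(points, scales):
    """Estimate the box-counting dimension of a point cloud (illustrative sketch).

    For each side length eps in `scales`, count the grid cells that contain
    at least one point, then fit the slope of log N(eps) versus log(1/eps).
    """
    points = np.asarray(points, dtype=float)
    scales = np.asarray(scales, dtype=float)
    counts = []
    for eps in scales:
        # Integer index of the grid cell (side length eps) containing each point.
        cells = np.floor(points / eps).astype(np.int64)
        # Number of distinct occupied cells at this scale.
        counts.append(len(np.unique(cells, axis=0)))
    # Least-squares line through (log(1/eps), log N(eps)); the slope is the estimate.
    slope, _intercept = np.polyfit(np.log(1.0 / scales), np.log(counts), 1)
    return slope


# Hypothetical data: a one-dimensional line segment embedded in R^3.
rng = np.random.default_rng(0)
t = rng.uniform(0.0, 1.0, size=20000)
segment = np.column_stack([t, 0.37 * t, 0.61 * t])

scales = [0.2, 0.1, 0.05, 0.025]
estimate = box_counting_dimension(segment, scales)
print(f"estimated box-counting dimension: {estimate:.2f}")
```

For this segment the estimate should come out near 1 even though the ambient dimension is 3, which is the kind of gap between intrinsic and ambient dimension that the stated learning rates exploit. On finite samples the estimate depends on the range of scales used, so it should be read as a rough diagnostic only.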
Minimax Optimal Deep Neural Network Classifiers Under Smooth Decision Boundary
It is shown that DNN classifiers can adapt to low-dimensional data structures and circumvent the “curse of dimensionality” in the sense that the minimax rate only depends on the effective dimension, potentially much smaller than the actual data dimension.
Relationship between Permeability Coefficient and Fractal Dimension of Pore in Ionic Rare Earth Magnesium Salt Leaching Ore
The change of permeability coefficient of ionic rare earth ore is one of the most important factors causing the uncontrollable flow of leaching solution, and the variation of pore structure of the
Your Contrastive Learning Is Secretly Doing Stochastic Neighbor Embedding
This work contributes to the theoretical understanding of SSCL, uncovers its connection to the classic data visualization method stochastic neighbor embedding, and demonstrates that the modifications from SNE to t-SNE can also be adopted in the SSCL setting, improving both in-distribution and out-of-distribution generalization.
Dimensionality Reduction and Wasserstein Stability for Kernel Regression
A novel stability result of kernel regression with respect to the Wasserstein distance is derived, which allows us to bound errors that occur when perturbed input data is used to fit a kernel function.
Nonparametric goodness‐of‐fit testing for parametric covariate models in pharmacometric analyses
This manuscript derives and evaluates nonparametric goodness‐of‐fit tests that take a parametric covariate model as the null hypothesis and test it against a kernelized, Tikhonov-regularized alternative, transferring concepts from statistical learning to the pharmacological setting.
Interpolation and Learning with Scale Dependent Kernels
This work considers the common case of estimators defined by scale-dependent kernels and focuses on the role of the scale: these estimators interpolate the data, and the scale can be shown to control their stability through the condition number.
Sample complexity and effective dimension for regression on manifolds
A novel nonasymptotic version of the Weyl law from differential geometry is established and used to show that certain spaces of smooth functions on a manifold are effectively finite-dimensional, with a complexity that scales according to the manifold dimension rather than any ambient data dimension.
Training Two-Layer ReLU Networks with Gradient Descent is Inconsistent
A large class of one-dimensional data-generating distributions is identified for which, with high probability, gradient descent only finds a bad local minimum of the optimization landscape; in these cases, the found network essentially performs linear regression even if the target function is non-linear.


Learning and approximation by Gaussians on Riemannian manifolds
It is shown that the convolution with the Gaussian kernel with variance σ provides the uniform approximation order of O(σ^s) when the approximated function is Lipschitz of order s ∈ (0, 1].
We confirm by the multi-Gaussian support vector machine (SVM) classification that the information of the intrinsic dimension of Riemannian manifolds can be used to illustrate the efficiency (learning
A tree-based regressor that adapts to intrinsic dimension
The concept of entropy constitutes, together with energy, a cornerstone of contemporary physics and related areas. It was originally introduced by Clausius in 1865 along abstract lines focusing on
Improved Classification Rates for Localized SVMs
This work observes that a margin condition that relates the distance to the decision boundary to the amount of noise is crucial to obtain rates and shows that these rates are obtained adaptively, that is, without knowing the parameters resulting from the margin conditions.
Entropy, Compactness and the Approximation of Operators
1. Entropy quantities 2. Approximation quantities 3. Inequalities of Bernstein-Jackson type 4. A refined Riesz theory 5. Operators with values in C(X) 6.
The fractal dimension of the Lorenz attractor
On the concept of attractor
This note proposes a definition for the concept of “attractor,” based on the probable asymptotic behavior of orbits. The definition is sufficiently broad so that every smooth compact dynamical system
Dimension of chaotic attractors