# Deep Learning and MARS: A Connection

@article{Kohler2019DeepLA, title={Deep Learning and MARS: A Connection}, author={Michael Kohler and Adam Krzyżak and Sophie Langer}, journal={ArXiv}, year={2019}, volume={abs/1908.11140} }

We consider least squares regression estimates using deep neural networks. We show that these estimates satisfy an oracle inequality, which implies that (up to a logarithmic factor) their error is at least as small as the optimal error bound one would expect for MARS if that procedure worked in the optimal way. As a result we show that our neural networks are able to achieve a dimensionality reduction in case the regression function locally has…
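The MARS connection in the title can be made concrete: MARS fits a least squares model over hinge (truncated linear) basis functions max(0, x − t) and products thereof, and a ReLU network unit max(0, w·x + b) computes exactly such a hinge. A minimal one-dimensional sketch of a least squares fit over a hinge dictionary (my own illustration, not code from the paper):

```python
import numpy as np

# Illustration only: fit a noisy hinge-shaped target by least squares over
# a small dictionary of MARS-style hinge basis functions max(0, x - t).
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=200)
y = np.maximum(0, x - 0.2) + 0.1 * rng.standard_normal(200)

knots = np.array([-0.5, 0.0, 0.2, 0.5])           # candidate knot locations
B = np.column_stack(
    [np.ones_like(x)] + [np.maximum(0, x - t) for t in knots]
)                                                  # design matrix (200, 5)
coef, *_ = np.linalg.lstsq(B, y, rcond=None)       # ordinary least squares

residual = y - B @ coef
print(float(np.mean(residual**2)))                 # training MSE near the noise level
```

Each ReLU unit of a depth-one network realizes one such hinge, which is why least squares neural network estimates can be compared against an idealized MARS error bound.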

## 6 Citations

### Analysis of the rate of convergence of neural network regression estimates which are easy to implement.

- Computer Science, Mathematics
- 2019

This article introduces a new neural network regression estimate in which, motivated by some recent approximation results for neural networks, most of the weights are chosen independently of the data; the estimate is therefore easy to implement, and it achieves a one-dimensional rate of convergence.
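One way to read "most of the weights are chosen independently of the data" is a random-features construction: draw the hidden-layer weights at random, then learn only the output layer by least squares. A hypothetical sketch of that idea (not the cited paper's actual estimator):

```python
import numpy as np

# Illustration only: random-features regression. Hidden weights are fixed
# without looking at the responses; only the linear output layer is fitted.
rng = np.random.default_rng(2)
x = rng.uniform(-1, 1, size=(300, 1))
y = np.cos(3 * x[:, 0]) + 0.05 * rng.standard_normal(300)

# Hidden layer: 50 ReLU units with data-independent (random) weights.
W = rng.standard_normal((1, 50))
b = rng.uniform(-1, 1, 50)
H = np.column_stack([np.ones(300), np.maximum(0, x @ W + b)])  # (300, 51)

# Only the output layer is learned, by ordinary least squares.
c, *_ = np.linalg.lstsq(H, y, rcond=None)
pred = H @ c
print(float(np.mean((pred - y) ** 2)))  # small training error
```

Because only a linear least squares problem is solved, such an estimate avoids nonconvex training entirely, which is what makes it "easy to implement".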

### On the rate of convergence of a neural network regression estimate learned by gradient descent

- Computer Science, Mathematics
- 2019

It is shown that the resulting estimate achieves (up to a logarithmic factor) the optimal rate of convergence in a projection pursuit model.

### Over-parametrized deep neural networks do not generalize well

- Computer Science
- 2019

A lower bound is presented which proves that deep neural networks with the sigmoidal squasher activation function in a regression setting do not generalize well on new data, in the sense that they do not achieve the optimal minimax rate of convergence for estimation of smooth regression functions.

### Discussion of: “Nonparametric regression using deep neural networks with ReLU activation function”

- Computer Science, The Annals of Statistics
- 2020

Kohler and Krzyżak (2017) extended this function class in the form of the so-called generalized hierarchical interaction models.

### On the rate of convergence of image classifiers based on convolutional neural networks

- Computer Science, ArXiv
- 2020

This work proves that in image classification it is possible to circumvent the curse of dimensionality by convolutional neural networks.

### Approximating smooth functions by deep neural networks with sigmoid activation function

- Computer Science, J. Multivar. Anal.
- 2021

## References

Showing 1–10 of 43 references.

### On deep learning as a remedy for the curse of dimensionality in nonparametric regression

- Computer Science, The Annals of Statistics
- 2019

It is shown that least squares estimates based on multilayer feedforward neural networks are able to circumvent the curse of dimensionality in nonparametric regression.

### Nonparametric regression using deep neural networks with ReLU activation function

- Computer Science, ArXiv
- 2017

The theory suggests that for nonparametric regression, scaling the network depth with the sample size is natural and the analysis gives some insights into why multilayer feedforward neural networks perform well in practice.

### Deep Neural Networks Learn Non-Smooth Functions Effectively

- Computer Science, AISTATS
- 2019

It is shown that the DNN estimators are nearly optimal for estimating non-smooth functions, while some popular alternative models do not attain the optimal rate.

### Local Dimensionality Reduction

- Computer Science, Mathematics, NIPS
- 1997

This paper examines several techniques for local dimensionality reduction in the context of locally weighted linear regression and finds that locally weighted partial least squares regression offers the best average results, thus outperforming even factor analysis, the theoretically most appealing of the candidate techniques.
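For context, the base technique here, locally weighted linear regression, solves a separate weighted least squares problem at each query point. A minimal one-dimensional sketch (my illustration, assuming a Gaussian kernel; not the paper's code):

```python
import numpy as np

def lwlr(x_train, y_train, x_query, bandwidth=0.1):
    """Predict at x_query via a locally weighted linear fit (1-D)."""
    X = np.column_stack([np.ones_like(x_train), x_train])   # intercept + slope
    w = np.exp(-0.5 * ((x_train - x_query) / bandwidth) ** 2)  # Gaussian weights
    # Solve the weighted normal equations  X' W X  beta = X' W y
    XtW = X.T * w
    beta = np.linalg.solve(XtW @ X, XtW @ y_train)
    return beta[0] + beta[1] * x_query

rng = np.random.default_rng(1)
x = rng.uniform(0, 1, 100)
y = np.sin(2 * np.pi * x) + 0.05 * rng.standard_normal(100)
print(lwlr(x, y, 0.5))  # should be near sin(pi) = 0
```

In high dimensions the design matrix in each local fit becomes ill-conditioned, which is exactly the brittleness that motivates the local dimensionality reduction techniques compared in this paper.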

### Convergence rates for single hidden layer feedforward networks

- Computer Science, Mathematics, Neural Networks
- 1994

### Local dimensionality reduction for locally weighted learning

- Computer Science, Proceedings of the 1997 IEEE International Symposium on Computational Intelligence in Robotics and Automation (CIRA'97)
- 1997

A learning algorithm is derived that uses a dynamically growing local dimensionality reduction as a preprocessing step for a nonparametric learning technique, locally weighted regression, exploiting the fact that data distributions from physical movement systems are locally low-dimensional and dense.

### Rate-optimal estimation for a general class of nonparametric regression models with unknown link functions

- Mathematics, Computer Science
- 2007

A nonparametric regression model that naturally generalizes neural network models is discussed; it is based on a finite number of one-dimensional transformations and can be estimated with a one-dimensional rate of convergence.

### Investigating Smooth Multiple Regression by the Method of Average Derivatives

- Mathematics
- 2015

Let (x_1, …, x_k, y) be a random vector where y denotes a response on the vector x of predictor variables. In this article we propose a technique [termed average derivative estimation (ADE)]…

### Approximation and estimation bounds for artificial neural networks

- Computer Science, Machine Learning
- 2004

The analysis involves Fourier techniques for the approximation error, metric entropy considerations for the estimation error, and a calculation of the index of resolvability of minimum complexity estimation of the family of networks.

### Local Dimensionality Reduction for Non-Parametric Regression

- Computer Science, Neural Processing Letters
- 2009

Locally-weighted regression is a computationally-efficient technique for non-linear regression. However, for high-dimensional data, this technique becomes numerically brittle and computationally too…