Estimation error analysis of deep learning on the regression problem on the variable exponent Besov space

Kazuma Tsuji and Taiji Suzuki
Deep learning has achieved notable success in various fields, including image and speech recognition. One factor behind this performance is its high feature extraction ability. In this study, we focus on the adaptivity of deep learning; accordingly, we treat the variable exponent Besov space, whose smoothness varies with the input location $x$. In other words, the difficulty of the estimation is not uniform within the domain. We analyze the…


Adaptive deep learning for nonparametric time series regression

A general theory for adaptive nonparametric estimation of mean functions of nonstationary and nonlinear time series using deep neural networks (DNNs) is developed, and the usefulness of the DNN methods is demonstrated for estimating nonlinear AR models with intrinsic low-dimensional structure and discontinuous or rough mean functions.

On the inability of Gaussian process regression to optimally learn compositional functions

We rigorously prove that deep Gaussian process priors can outperform Gaussian process priors if the target function has a compositional structure. To this end, we study information-theoretic lower bounds.

Nonconvex Sparse Regularization for Deep Neural Networks and Its Optimality

It is proved that the sparse-penalized DNN estimator can adaptively attain minimax convergence rates for various nonparametric regression problems and an efficient gradient-based optimization algorithm is developed that guarantees the monotonic reduction of the objective function.

Drift estimation for a multi-dimensional diffusion process using deep neural networks

A deep neural network method to estimate the drift coefficient of a multi-dimensional diffusion process from discrete observations is studied and it is shown that they achieve the minimax rate of convergence up to a logarithmic factor when the drift function has a compositional structure.

Adaptivity of deep ReLU network for learning in Besov and mixed smooth Besov spaces: optimal rate and curse of dimensionality

A new approximation and estimation error analysis of deep learning with the ReLU activation for functions in a Besov space and its variant with mixed smoothness shows that deep learning has higher adaptivity to the spatial inhomogeneity of the target function than other estimators such as linear ones.

Deep learning is adaptive to intrinsic dimensionality of model smoothness in anisotropic Besov space

The results show that deep learning has better dependence on the input dimensionality if the target function possesses anisotropic smoothness, and it achieves an adaptive rate for functions with spatially inhomogeneous smoothness.

Deep Neural Networks Learn Non-Smooth Functions Effectively

It is shown that the estimators by DNNs are almost optimal to estimate the non-smooth functions, while some of the popular models do not attain the optimal rate.

Minimax estimation via wavelet shrinkage

A nonlinear method that works in the wavelet domain by simple nonlinear shrinkage of the empirical wavelet coefficients is developed, and variants of this method based on simple threshold nonlinear estimators are shown to be nearly minimax.
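The shrinkage idea above can be sketched in a few lines: transform the signal into the wavelet domain, soft-threshold the detail coefficients, and transform back. The sketch below uses a single-level Haar transform and the universal threshold $\sigma\sqrt{2\log n}$; it is a minimal illustration of wavelet shrinkage in general, not the estimator analyzed in the paper.

```python
import math

def haar_step(signal):
    """One level of the orthonormal Haar wavelet transform."""
    s = 1.0 / math.sqrt(2.0)
    approx = [s * (signal[2 * i] + signal[2 * i + 1]) for i in range(len(signal) // 2)]
    detail = [s * (signal[2 * i] - signal[2 * i + 1]) for i in range(len(signal) // 2)]
    return approx, detail

def inverse_haar_step(approx, detail):
    """Invert one level of the Haar transform."""
    s = 1.0 / math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend([s * (a + d), s * (a - d)])
    return out

def soft_threshold(x, t):
    """Shrink a coefficient toward zero by t, setting small ones exactly to zero."""
    return math.copysign(max(abs(x) - t, 0.0), x)

def wavelet_shrinkage(signal, sigma):
    """Denoise by soft-thresholding the empirical detail coefficients
    at the universal threshold sigma * sqrt(2 log n)."""
    n = len(signal)
    t = sigma * math.sqrt(2.0 * math.log(n))
    approx, detail = haar_step(signal)
    detail = [soft_threshold(d, t) for d in detail]
    return inverse_haar_step(approx, detail)
```

With `sigma = 0` the threshold vanishes and the signal is reconstructed exactly, which is a quick sanity check on the transform pair.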

ImageNet classification with deep convolutional neural networks

A large, deep convolutional neural network was trained to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into 1000 different classes, employing a recently developed regularization method called "dropout" that proved to be very effective.
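Dropout, mentioned above, randomly zeroes units during training so the network cannot rely on any single feature. A minimal sketch of the common "inverted" variant (survivors rescaled by $1/(1-p)$ so expected activations match at test time); this is an illustration of the technique, not the paper's implementation:

```python
import random

def dropout(activations, p, training=True, rng=random):
    """Inverted dropout: during training, zero each unit with probability p
    and rescale the survivors by 1/(1-p); at test time, pass through unchanged."""
    if not training or p == 0.0:
        return list(activations)
    keep = 1.0 - p
    return [a / keep if rng.random() < keep else 0.0 for a in activations]
```

Because of the rescaling, each surviving unit is amplified by `1/keep`, so the expected value of every output equals its input and no adjustment is needed at inference.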

Decision Theoretic Generalizations of the PAC Model for Neural Net and Other Learning Applications

  D. Haussler, Inf. Comput., 1992

Density estimation by wavelet thresholding

Density estimation is a commonly used test case for nonparametric estimation methods. We explore the asymptotic properties of estimators based on thresholding of empirical wavelet coefficients.

Error bounds for approximations with deep ReLU networks

Density estimation in Besov spaces