Corpus ID: 8049057

Provable Bounds for Learning Some Deep Representations

@article{Arora2014ProvableBF,
  title={Provable Bounds for Learning Some Deep Representations},
  author={Sanjeev Arora and Aditya Bhaskara and Rong Ge and Tengyu Ma},
  journal={ArXiv},
  year={2014},
  volume={abs/1310.6343}
}
We give algorithms with provable guarantees that learn a class of deep nets in the generative model view popularized by Hinton and others. Our generative model is an n node multilayer network that has degree at most n^γ for some γ < 1 and each edge has a random edge weight in [-1, 1]. Our algorithm learns almost all networks in this class with polynomial running time. The sample complexity is quadratic or cubic depending upon the details of the model. The algorithm uses layerwise learning. It…
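As a rough illustration of the model described in the abstract, the sketch below samples a random layered network in which each node has about n^γ neighbours in the layer below with i.i.d. weights in [-1, 1], then generates an observation by propagating a sparse top-layer assignment downward. The 0/1 threshold units, equal layer widths, and top-layer sparsity used here are illustrative assumptions, not the paper's exact specification.

```python
import numpy as np

def sample_network(n, gamma, num_layers, rng):
    """Sample a random layered net: each node gets ~n**gamma random
    neighbours in the layer below, with i.i.d. weights in [-1, 1]."""
    d = max(1, int(round(n ** gamma)))          # degree bound n^gamma
    layers = []
    for _ in range(num_layers):
        W = np.zeros((n, n))
        for i in range(n):
            nbrs = rng.choice(n, size=d, replace=False)
            W[i, nbrs] = rng.uniform(-1.0, 1.0, size=d)
        layers.append(W)
    return layers

def sample_observation(layers, n, top_sparsity, rng):
    """Propagate a sparse random top-layer assignment downward.
    The 0/1 threshold nonlinearity is an illustrative choice."""
    h = (rng.random(n) < top_sparsity).astype(float)   # sparse top layer
    for W in layers:
        h = (W @ h > 0).astype(float)                  # threshold units
    return h

rng = np.random.default_rng(0)
net = sample_network(n=100, gamma=0.3, num_layers=3, rng=rng)
x = sample_observation(net, n=100, top_sparsity=0.05, rng=rng)
print(x.sum(), "active observed units out of", x.size)
```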
Algorithms Reading Group Notes: Provable Bounds for Learning Deep Representations
Continuing from last week, we again examine provable algorithms for learning neural networks. This time, we will consider the following problem: we draw a random ground truth network and choose some…
On the Learnability of Fully-Connected Neural Networks
This paper characterizes the learnability of fully-connected neural networks via both positive and negative results, and establishes a hardness result showing that the exponential dependence on 1/ε is unavoidable unless RP = NP.
Recovering the Lowest Layer of Deep Networks with High Threshold Activations
This work addresses the problem of recovering the lowest layer of a deep neural network under the assumptions that the highest layer uses a high threshold before applying the activation, so that the upper network can be modeled as a well-behaved polynomial, and that the input distribution is Gaussian.
Deep Stochastic Configuration Networks with Universal Approximation Property
  • Dianhui Wang, Ming Li
  • Computer Science, Mathematics
  • 2018 International Joint Conference on Neural Networks (IJCNN)
  • 2018
This paper develops a randomized approach for incrementally building deep neural networks, where a supervisory mechanism is proposed to constrain the random assignment of the weights and biases…
SGD Learns the Conjugate Kernel Class of the Network
We show that the standard stochastic gradient descent (SGD) algorithm is guaranteed to learn, in polynomial time, a function that is competitive with the best function in the conjugate kernel space of the network…
A Provably Correct Algorithm for Deep Learning that Actually Works
We describe a layer-by-layer algorithm for training deep convolutional networks, where each step involves gradient updates for a two layer network followed by a simple clustering algorithm. Our…
Autoencoders Learn Generative Linear Models
Recent progress in learning theory has led to the emergence of provable algorithms for training certain families of neural networks. Under the assumption that the training data is sampled from a…
Deep Stochastic Configuration Networks: Universal Approximation and Learning Representation
A supervisory mechanism is proposed to constrain the random assignment of the hidden parameters (i.e., all biases and weights within the hidden layers), and a full-rank oriented criterion is suggested and utilized as a termination condition to determine the number of nodes for each hidden layer.
Learning Compact Neural Networks with Regularization
This work proposes and analyzes regularized gradient descent algorithms for learning shallow networks and shows how regularization can be beneficial to overcome overparametrization.
Provable approximation properties for deep neural networks
A sparsely-connected depth-4 neural network that computes wavelet functions from Rectified Linear Units (ReLU) is constructed, and its error in approximating functions is bounded.
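As a hedged aside on the ReLU-based construction summarized above: compactly supported piecewise-linear bumps of the kind used in such wavelet-style approximations can be written as sums of four ReLUs. The offsets below are an illustrative choice, not the paper's exact construction.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def trapezoid(x):
    """Compactly supported piecewise-linear bump built from four ReLUs:
    equal to 1 on [-1, 1] and zero outside [-2, 2]. The breakpoints are
    illustrative, not the exact ones used in the paper."""
    return relu(x + 2) - relu(x + 1) - relu(x - 1) + relu(x - 2)

# Shifted and scaled copies of such bumps can be combined, as in wavelet
# frames, to approximate a target function on a grid.
xs = np.linspace(-3, 3, 13)
print(np.round(trapezoid(xs), 2))
```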

References

SHOWING 1-10 OF 27 REFERENCES
A Provably Efficient Algorithm for Training Deep Networks
The main goal of this paper is the derivation of a provably efficient, layer-by-layer algorithm for training deep neural networks, denoted the Basis Learner, which comes with formal polynomial-time convergence guarantees.
Learning Deep Architectures for AI
The motivations and principles behind learning algorithms for deep architectures are discussed, in particular those that exploit, as building blocks, unsupervised learning of single-layer models such as Restricted Boltzmann Machines, which are used to construct deeper models such as Deep Belief Networks.
Representation Learning: A Review and New Perspectives
Recent work in the area of unsupervised feature learning and deep learning is reviewed, covering advances in probabilistic models, autoencoders, manifold learning, and deep networks.
Learnability Beyond AC^0
In a celebrated result, Linial et al. [3] gave an algorithm which learns size-s depth-d AND/OR/NOT circuits in time n^{O(log s)^d} from uniformly distributed random examples on the Boolean cube {0, 1}^n.
Learnability beyond AC^0
This is the first algorithm for learning a more expressive circuit class than the class AC^0 of constant-depth polynomial-size circuits, a class which was shown to be learnable in quasipolynomial time by Linial, Mansour and Nisan in 1989.
Settling the Polynomial Learnability of Mixtures of Gaussians
  • Ankur Moitra, G. Valiant
  • Computer Science, Mathematics
  • 2010 IEEE 51st Annual Symposium on Foundations of Computer Science
  • 2010
This paper gives the first polynomial-time algorithm for proper density estimation for mixtures of k Gaussians that needs no assumptions on the mixture; the running time depends exponentially on k, and such a dependence is proved to be necessary.
Extracting and composing robust features with denoising autoencoders
This work introduces and motivates a new training principle for unsupervised learning of a representation based on the idea of making the learned representations robust to partial corruption of the input pattern.
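A minimal sketch of the denoising-autoencoder training principle described above, assuming mask-out corruption, sigmoid units, and squared-error reconstruction of the clean input; all sizes and hyperparameters here are arbitrary illustrations.

```python
import numpy as np

rng = np.random.default_rng(0)
d, h, lr, corrupt_p = 20, 8, 0.1, 0.3            # illustrative sizes only

W = rng.normal(0, 0.1, (h, d)); b = np.zeros(h)  # encoder parameters
V = rng.normal(0, 0.1, (d, h)); c = np.zeros(d)  # decoder parameters

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

X = (rng.random((500, d)) < 0.2).astype(float)   # toy binary data

for epoch in range(20):
    for x in X:
        x_tilde = x * (rng.random(d) > corrupt_p)   # mask-out corruption
        hid = sigmoid(W @ x_tilde + b)              # encode corrupted input
        x_hat = sigmoid(V @ hid + c)                # decode
        # squared-error gradients; the target is the *clean* input x
        delta_out = (x_hat - x) * x_hat * (1 - x_hat)
        delta_hid = (V.T @ delta_out) * hid * (1 - hid)
        V -= lr * np.outer(delta_out, hid); c -= lr * delta_out
        W -= lr * np.outer(delta_hid, x_tilde); b -= lr * delta_hid
```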
On Random Weights and Unsupervised Feature Learning
It is found that certain convolutional pooling architectures can be inherently frequency selective and translation invariant, even with random weights; the viability of extremely fast architecture search is also demonstrated by using random weights to evaluate candidate architectures, thereby sidestepping the time-consuming learning process.
Kernel Methods for Deep Learning
A new family of positive-definite kernel functions that mimic the computation in large, multilayer neural nets is introduced; these kernels can be used in shallow architectures, such as support vector machines (SVMs), or in deep kernel-based architectures that the authors call multilayer kernel machines (MKMs).
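The kernels referred to above are, to the best of my reading, the arc-cosine kernels of Cho and Saul; the sketch below computes the degree-1 member of that family and composes it across a few "layers" under that assumption.

```python
import numpy as np

def arccos_kernel1(dot_xy, norm_x, norm_y):
    """Degree-1 arc-cosine kernel: (|x||y|/pi) * (sin t + (pi - t) cos t),
    where t is the angle between x and y."""
    cos_t = np.clip(dot_xy / (norm_x * norm_y + 1e-12), -1.0, 1.0)
    t = np.arccos(cos_t)
    return (norm_x * norm_y / np.pi) * (np.sin(t) + (np.pi - t) * np.cos(t))

def multilayer_kernel(x, y, layers=2):
    """Iterate the arc-cosine map on the linear kernel to mimic a
    multilayer net (my reading of the layered composition)."""
    kxy, kxx, kyy = x @ y, x @ x, y @ y
    for _ in range(layers):
        kxy = arccos_kernel1(kxy, np.sqrt(kxx), np.sqrt(kyy))
        kxx = arccos_kernel1(kxx, np.sqrt(kxx), np.sqrt(kxx))
        kyy = arccos_kernel1(kyy, np.sqrt(kyy), np.sqrt(kyy))
    return kxy

x, y = np.array([1.0, 0.5]), np.array([0.2, -1.0])
print(multilayer_kernel(x, y, layers=2))
```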
Cryptographic Hardness for Learning Intersections of Halfspaces
The first representation-independent hardness results for PAC learning intersections of halfspaces are given, derived from two public-key cryptosystems due to Regev, which are based on the worst-case hardness of well-studied lattice problems.