Learning Deep Architectures for AI

@article{Bengio2007LearningDA,
  title={Learning Deep Architectures for AI},
  author={Yoshua Bengio},
  journal={Found. Trends Mach. Learn.},
  year={2007},
  volume={2},
  pages={1-127}
}
  • Yoshua Bengio
  • Published 2007
  • Computer Science
  • Found. Trends Mach. Learn.
Theoretical results strongly suggest that in order to learn the kind of complicated functions that can represent high-level abstractions (e.g. in vision, language, and other AI-level tasks), one needs deep architectures. Deep architectures are composed of multiple levels of non-linear operations, such as in neural nets with many hidden layers or in complicated propositional formulae re-using many sub-formulae. Searching the parameter space of deep architectures is a difficult optimization task… 
Category: Learning Algorithms
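To make the abstract's notion concrete, a deep architecture in the neural-network sense is a composition of several levels of non-linear operations. The following minimal NumPy sketch is illustrative only; the layer sizes, tanh non-linearity, and random initialization are assumptions made for the example, not details taken from the monograph.

    import numpy as np

    def init_deep_net(layer_sizes, seed=0):
        # Randomly initialize one weight matrix and bias vector per level.
        # A list such as [784, 500, 500, 500, 10] gives three hidden layers,
        # i.e. a "deep" architecture in the sense of the abstract.
        rng = np.random.default_rng(seed)
        return [(rng.normal(0.0, 0.01, size=(n_in, n_out)), np.zeros(n_out))
                for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]

    def forward(params, x):
        # Compose several levels of non-linear operations (tanh layers).
        h = x
        for W, b in params:
            h = np.tanh(h @ W + b)
        return h

    # Example: a four-level composition of non-linearities on a dummy input.
    params = init_deep_net([784, 500, 500, 500, 10])
    y = forward(params, np.zeros((1, 784)))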
Recent theoretical studies indicate that deep architectures [4, 2] are needed to efficiently model complex distributions and to achieve better generalization performance on challenging recognition tasks…
Learning deep generative models
TLDR
The aim of the thesis is to demonstrate that deep generative models that contain many layers of latent variables and millions of parameters can be learned efficiently, and that the learned high-level feature representations can be successfully applied in a wide spectrum of application domains, including visual object recognition, information retrieval, and classification and regression tasks.
Reinforcement Learning with Deep Architectures
There is both theoretical and empirical evidence that deep architectures may be more appropriate than shallow architectures for learning functions which exhibit hierarchical structure, and which can…
Understanding Representations Learned in Deep Architectures
Deep architectures have demonstrated state-of-the-art performance in a variety of settings, especially with vision datasets. Deep learning algorithms are based on learning several levels of representation…
Exploring Strategies for Training Deep Neural Networks
TLDR
These experiments confirm the hypothesis that the greedy layer-wise unsupervised training strategy helps the optimization by initializing weights in a region near a good local minimum, but also implicitly acts as a sort of regularization that brings better generalization and encourages internal distributed representations that are high-level abstractions of the input.
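The greedy layer-wise strategy referred to above trains one layer at a time on the representation produced by the layers below it, and the result is used to initialize a deep network before supervised fine-tuning. The sketch below uses a plain autoencoder trained by gradient descent as the per-layer module; the cited papers use RBMs or denoising autoencoders, and the sizes, learning rate, and epoch count here are illustrative assumptions.

    import numpy as np

    def train_autoencoder(X, n_hidden, lr=0.01, epochs=100, seed=0):
        # One layer: tanh encoder, linear decoder, squared reconstruction error.
        rng = np.random.default_rng(seed)
        n, n_in = X.shape
        W_enc = rng.normal(0.0, 0.01, (n_in, n_hidden)); b_enc = np.zeros(n_hidden)
        W_dec = rng.normal(0.0, 0.01, (n_hidden, n_in)); b_dec = np.zeros(n_in)
        for _ in range(epochs):
            H = np.tanh(X @ W_enc + b_enc)          # encode
            R = H @ W_dec + b_dec                   # decode
            dR = (R - X) / n                        # gradient of 0.5 * mean squared error
            dW_dec = H.T @ dR
            db_dec = dR.sum(axis=0)
            dH = (dR @ W_dec.T) * (1.0 - H ** 2)    # backprop through tanh
            dW_enc = X.T @ dH
            db_enc = dH.sum(axis=0)
            W_dec -= lr * dW_dec; b_dec -= lr * db_dec
            W_enc -= lr * dW_enc; b_enc -= lr * db_enc
        return W_enc, b_enc

    def greedy_pretrain(X, hidden_sizes):
        # Train each layer unsupervised on the previous layer's output, then freeze it.
        params, H = [], X
        for n_hidden in hidden_sizes:
            W, b = train_autoencoder(H, n_hidden)
            params.append((W, b))
            H = np.tanh(H @ W + b)   # representation fed to the next layer
        return params                # used to initialize a deep net before fine-tuning

    # Example: pretrain a three-hidden-layer stack on random data.
    X = np.random.default_rng(1).normal(size=(256, 64))
    stack = greedy_pretrain(X, [32, 16, 8])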
Deep Representation Learning with Genetic Programming
A representation can be seen as a set of variables, known as features, that describe a phenomenon. Machine learning (ML) algorithms make use of these representations to achieve the task they are…
The Self-Organizing Restricted Boltzmann Machine for Deep Representation with the Application on Classification Problems
TLDR
This paper introduces an approach that determines the number of hidden layers and neurons of the deep network automatically during the learning process; it also acts as a regularization method, since neurons whose weights fall below a threshold are removed and the RBM thus learns to copy the input only approximately.
Deep Learning of Representations
TLDR
This chapter reviews the main motivations and ideas behind deep learning algorithms and their representation-learning components, as well as recent results, and proposes a vision of challenges and hopes on the road ahead, focusing on the questions of invariance and disentangling.
A review of deep learning with special emphasis on architectures, applications and recent trends
TLDR
The thrust of this review is to outline emerging applications of DL and to provide a reference for researchers seeking to use DL in their pattern recognition work, given its unparalleled learning capacity and ability to scale with data.
Variational approach to unsupervised learning
  • S. Shah
  • Mathematics, Computer Science
    ArXiv
  • 2019
TLDR
It is demonstrated how one can arrive at convolutional deep belief networks as a potential solution to unsupervised learning problems without making assumptions about the underlying framework.

References

Showing 1-10 of 352 references
Greedy Layer-Wise Training of Deep Networks
TLDR
These experiments confirm the hypothesis that the greedy layer-wise unsupervised training strategy mostly helps the optimization, by initializing weights in a region near a good local minimum, giving rise to internal distributed representations that are high-level abstractions of the input, bringing better generalization.
Scaling learning algorithms towards AI
TLDR
It is argued that deep architectures have the potential to generalize in non-local ways, i.e., beyond immediate neighbors, and that this is crucial in order to make progress on the kind of complex tasks required for artificial intelligence.
Exploring Strategies for Training Deep Neural Networks
TLDR
These experiments confirm the hypothesis that the greedy layer-wise unsupervised training strategy helps the optimization by initializing weights in a region near a good local minimum, but also implicitly acts as a sort of regularization that brings better generalization and encourages internal distributed representations that are high-level abstractions of the input.
Representational Power of Restricted Boltzmann Machines and Deep Belief Networks
TLDR
This work proves that adding hidden units yields strictly improved modeling power, while a second theorem shows that RBMs are universal approximators of discrete distributions and suggests a new and less greedy criterion for training RBMs within DBNs.
The Difficulty of Training Deep Architectures and the Effect of Unsupervised Pre-Training
TLDR
The experiments confirm and clarify the advantage of unsupervised pre-training, and empirically show the influence of pre-training with respect to architecture depth, model capacity, and number of training examples.
Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations
TLDR
The convolutional deep belief network is presented, a hierarchical generative model which scales to realistic image sizes and is translation-invariant and supports efficient bottom-up and top-down probabilistic inference.
On the quantitative analysis of deep belief networks
TLDR
It is shown that Annealed Importance Sampling (AIS) can be used to efficiently estimate the partition function of an RBM, and a novel AIS scheme for comparing RBMs with different architectures is presented.
Sparse Feature Learning for Deep Belief Networks
TLDR
This work proposes a simple criterion to compare and select different unsupervised machines based on the trade-off between the reconstruction error and the information content of the representation, and describes a novel and efficient algorithm to learn sparse representations.
A Fast Learning Algorithm for Deep Belief Nets
TLDR
A fast, greedy algorithm is derived that can learn deep, directed belief networks one layer at a time, provided the top two layers form an undirected associative memory.
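The layer-at-a-time procedure described in this reference trains each layer as an RBM, typically with the contrastive divergence (CD-1) approximation. Below is a minimal CD-1 update for a binary RBM; the batch handling, learning rate, and use of probabilities rather than samples in the update are common simplifications and are assumptions of this sketch, not details quoted from the paper.

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def cd1_update(V, W, b_vis, b_hid, lr, rng):
        # One contrastive-divergence (CD-1) update for a binary RBM,
        # given a batch of visible vectors V with shape (batch, n_visible).
        p_h = sigmoid(V @ W + b_hid)                     # positive phase
        h = (rng.random(p_h.shape) < p_h).astype(float)  # sample hidden units
        p_v = sigmoid(h @ W.T + b_vis)                   # one Gibbs step down...
        p_h_neg = sigmoid(p_v @ W + b_hid)               # ...and back up (negative phase)
        n = V.shape[0]
        W += lr * (V.T @ p_h - p_v.T @ p_h_neg) / n
        b_vis += lr * (V - p_v).mean(axis=0)
        b_hid += lr * (p_h - p_h_neg).mean(axis=0)
        return W, b_vis, b_hid

    # Example: a few updates on random binary data.
    rng = np.random.default_rng(0)
    V = (rng.random((128, 784)) < 0.5).astype(float)
    W, b_vis, b_hid = rng.normal(0.0, 0.01, (784, 256)), np.zeros(784), np.zeros(256)
    for _ in range(10):
        W, b_vis, b_hid = cd1_update(V, W, b_vis, b_hid, lr=0.05, rng=rng)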
Deep Boltzmann Machines
TLDR
A new learning algorithm for Boltzmann machines that contain many layers of hidden variables, which is made more efficient by using a layer-by-layer “pre-training” phase that allows variational inference to be initialized with a single bottom-up pass.
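The variational inference referred to here is a factorized mean-field approximation over the hidden layers; pre-training provides an initialization from a single bottom-up pass, after which the mean-field updates are iterated. A minimal sketch for a two-hidden-layer binary DBM follows; doubling the bottom-up weights at initialization (to compensate for the missing top-down input) and the iteration count are assumptions made for the example.

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def mean_field_dbm(v, W1, b1, W2, b2, n_iters=10):
        # Factorized (mean-field) posterior over two hidden layers of a binary DBM.
        mu1 = sigmoid(v @ (2.0 * W1) + b1)   # single bottom-up pass as initialization
        mu2 = sigmoid(mu1 @ W2 + b2)
        for _ in range(n_iters):
            mu1 = sigmoid(v @ W1 + mu2 @ W2.T + b1)  # combines bottom-up and top-down input
            mu2 = sigmoid(mu1 @ W2 + b2)
        return mu1, mu2

    # Example with random parameters.
    rng = np.random.default_rng(0)
    v = (rng.random((16, 784)) < 0.5).astype(float)
    W1, W2 = rng.normal(0.0, 0.01, (784, 500)), rng.normal(0.0, 0.01, (500, 250))
    mu1, mu2 = mean_field_dbm(v, W1, np.zeros(500), W2, np.zeros(250))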