Learning Deep Architectures for AI

@article{Bengio2007LearningDA,
  title={Learning Deep Architectures for AI},
  author={Yoshua Bengio},
  journal={Found. Trends Mach. Learn.},
  year={2009},
  volume={2},
  pages={1-127}
}
  • Yoshua Bengio
  • Published 2009
  • Computer Science
  • Found. Trends Mach. Learn.
Theoretical results strongly suggest that in order to learn the kind of complicated functions that can represent high-level abstractions (e.g. in vision, language, and other AI-level tasks), one needs deep architectures. Deep architectures are composed of multiple levels of non-linear operations, such as in neural nets with many hidden layers or in complicated propositional formulae re-using many sub-formulae. Searching the parameter space of deep architectures is a difficult optimization task… 
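As a concrete illustration of "multiple levels of non-linear operations", the following minimal sketch simply composes several affine-plus-nonlinearity layers; the layer sizes, tanh activation, and random initialization are illustrative choices, not taken from the paper.

```python
import numpy as np

def init_deep_net(layer_sizes, seed=0):
    """Random weights and biases for a stack of fully connected layers."""
    rng = np.random.default_rng(seed)
    return [(rng.normal(scale=0.1, size=(n_in, n_out)), np.zeros(n_out))
            for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]

def forward(x, params):
    """Depth comes from composing many affine maps, each followed by a non-linearity."""
    h = x
    for W, b in params:
        h = np.tanh(h @ W + b)
    return h

# Illustrative: a four-level architecture mapping 784-dim inputs to a 10-dim output.
params = init_deep_net([784, 256, 128, 64, 10])
out = forward(np.zeros((1, 784)), params)
```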

Deep Woods

The principle of training a deep architecture by greedy layer-wise unsupervised training has been shown to be successful for deep connectionist architectures and this work attempts to exploit this principle to develop new deep architectures based on deterministic or stochastic decision trees.
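The greedy layer-wise principle itself can be sketched independently of the decision-tree setting: train one unsupervised module on the data, freeze it, and train the next module on its outputs. The sketch below uses tied-weight autoencoders with plain gradient descent as an assumed stand-in for the unsupervised learner; it is not the procedure of this particular work.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_autoencoder_layer(x, n_hidden, lr=0.1, epochs=50, seed=0):
    """Train one tied-weight autoencoder layer by gradient descent
    on squared reconstruction error (illustrative settings)."""
    rng = np.random.default_rng(seed)
    n_visible = x.shape[1]
    W = rng.normal(scale=0.01, size=(n_visible, n_hidden))
    b = np.zeros(n_hidden)                # hidden (code) bias
    c = np.zeros(n_visible)               # reconstruction bias
    for _ in range(epochs):
        h = sigmoid(x @ W + b)            # encode
        x_hat = h @ W.T + c               # linear decode with tied weights
        d = (x_hat - x) / len(x)          # gradient of 0.5*MSE w.r.t. x_hat
        dz = (d @ W) * h * (1.0 - h)      # backpropagate through the encoder
        W -= lr * (d.T @ h + x.T @ dz)    # decoder + encoder contributions
        b -= lr * dz.sum(axis=0)
        c -= lr * d.sum(axis=0)
    return W, b

def greedy_layerwise_pretrain(x, layer_sizes):
    """Train each layer unsupervised on the codes produced by the layers
    below it, then freeze it -- the greedy layer-wise principle."""
    params, h = [], x
    for n_hidden in layer_sizes:
        W, b = train_autoencoder_layer(h, n_hidden)
        params.append((W, b))
        h = sigmoid(h @ W + b)            # propagate data to the next level
    return params                         # initialization for later supervised fine-tuning

# Illustrative usage on random data standing in for real inputs.
x = np.random.default_rng(1).random((100, 64))
stack = greedy_layerwise_pretrain(x, [32, 16])
```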

Learning deep generative models

The aim of the thesis is to demonstrate that deep generative models that contain many layers of latent variables and millions of parameters can be learned efficiently, and that the learned high-level feature representations can be successfully applied in a wide spectrum of application domains, including visual object recognition, information retrieval, and classification and regression tasks.

Reinforcement Learning with Deep Architectures

Some of the issues that arise when trying to integrate ideas from deep learning into the reinforcement learning framework are considered, a class of algorithms referred to as Iterative Feature Extracting Learning Agents (IFELAs) is presented, and their performance on the inverted pendulum problem is compared to more standard approaches.

Understanding Representations Learned in Deep Architectures

It is shown that consistent filter-like interpretation is possible and simple to accomplish at the unit level and it is hoped that such techniques will allow researchers in deep architectures to understand more of how and why deep architectures work.

Exploring Strategies for Training Deep Neural Networks

These experiments confirm the hypothesis that the greedy layer-wise unsupervised training strategy helps the optimization by initializing weights in a region near a good local minimum, but also implicitly acts as a sort of regularization that brings better generalization and encourages internal distributed representations that are high-level abstractions of the input.

Deep Representation Learning with Genetic Programming

The pitfalls of developing representation learning systems with GP that do not require experts’ knowledge are explored, and a solution based on layered GP structures, similar to those found in DNNs, is proposed.

Variational approach to unsupervised learning

  • S. Shah
  • Computer Science
  • Journal of Physics Communications
  • 2019
It is demonstrated how one can arrive at convolutional deep belief networks as a potential solution to unsupervised learning problems without making assumptions about the underlying framework.

Deep Learning of Representations

This chapter reviews the main motivations and ideas behind deep learning algorithms and their representation-learning components, as well as recent results, and proposes a vision of challenges and hopes on the road ahead, focusing on the questions of invariance and disentangling.

Restricted Boltzmann Machines and Deep Belief Networks on multi-core processors

This paper presents an approach that relies mainly on three kernels for implementing both the Restricted Boltzmann Machines (RBM) and Deep Belief Networks (DBN) algorithms, and uses a step adaptive learning rate procedure which accelerates convergence.
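The paper's specific kernels and its step-adaptive learning rate are not reproduced here; for orientation, this is only a generic sketch of the CD-1 update at the heart of binary RBM training, with a fixed learning rate and illustrative sizes.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cd1_step(v0, W, b, c, lr, rng):
    """One contrastive-divergence (CD-1) update for a binary RBM.
    W: (n_visible, n_hidden), b: hidden bias, c: visible bias."""
    # Positive phase: hidden probabilities given the data.
    ph0 = sigmoid(v0 @ W + b)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    # Negative phase: one step of Gibbs sampling.
    pv1 = sigmoid(h0 @ W.T + c)
    v1 = (rng.random(pv1.shape) < pv1).astype(float)
    ph1 = sigmoid(v1 @ W + b)
    # Approximate gradient of the data log-likelihood.
    n = len(v0)
    W += lr * (v0.T @ ph0 - v1.T @ ph1) / n
    b += lr * (ph0 - ph1).mean(axis=0)
    c += lr * (v0 - v1).mean(axis=0)
    return W, b, c

# Illustrative usage with binary random data standing in for real inputs.
rng = np.random.default_rng(0)
v = (rng.random((100, 64)) > 0.5).astype(float)
W = rng.normal(scale=0.01, size=(64, 32))
b, c = np.zeros(32), np.zeros(64)
for _ in range(10):                    # a few illustrative epochs
    W, b, c = cd1_step(v, W, b, c, lr=0.05, rng=rng)
```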
...

References

SHOWING 1-10 OF 242 REFERENCES

Scaling learning algorithms towards AI

It is argued that deep architectures have the potential to generalize in non-local ways, i.e., beyond immediate neighbors, and that this is crucial in order to make progress on the kind of complex tasks required for artificial intelligence.

Exploring Strategies for Training Deep Neural Networks

These experiments confirm the hypothesis that the greedy layer-wise unsupervised training strategy helps the optimization by initializing weights in a region near a good local minimum, but also implicitly acts as a sort of regularization that brings better generalization and encourages internal distributed representations that are high-level abstractions of the input.

Representational Power of Restricted Boltzmann Machines and Deep Belief Networks

This work proves that adding hidden units yields strictly improved modeling power, while a second theorem shows that RBMs are universal approximators of discrete distributions and suggests a new and less greedy criterion for training RBMs within DBNs.

Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations

The convolutional deep belief network is presented, a hierarchical generative model which scales to realistic image sizes and is translation-invariant and supports efficient bottom-up and top-down probabilistic inference.

The Difficulty of Training Deep Architectures and the Effect of Unsupervised Pre-Training

The experiments confirm and clarify the advantage of unsupervised pre-training, and empirically show the influence of pre-training with respect to architecture depth, model capacity, and number of training examples.

A Fast Learning Algorithm for Deep Belief Nets

A fast, greedy algorithm is derived that can learn deep, directed belief networks one layer at a time, provided the top two layers form an undirected associative memory.
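The one-layer-at-a-time recipe can be sketched as stacking RBMs: each RBM is trained on the (mean-field) hidden activities of the layer below and is then frozen. The sketch below omits the top-level undirected associative memory and any fine-tuning, and all hyperparameters are illustrative rather than taken from the paper.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_rbm(v, n_hidden, lr=0.05, epochs=20, seed=0):
    """Compressed CD-1 training of a binary RBM (mean-field negative phase)."""
    rng = np.random.default_rng(seed)
    W = rng.normal(scale=0.01, size=(v.shape[1], n_hidden))
    b, c = np.zeros(n_hidden), np.zeros(v.shape[1])
    for _ in range(epochs):
        ph0 = sigmoid(v @ W + b)
        h0 = (rng.random(ph0.shape) < ph0).astype(float)
        pv1 = sigmoid(h0 @ W.T + c)
        ph1 = sigmoid(pv1 @ W + b)
        W += lr * (v.T @ ph0 - pv1.T @ ph1) / len(v)
        b += lr * (ph0 - ph1).mean(axis=0)
        c += lr * (v - pv1).mean(axis=0)
    return W, b

def train_dbn(v, layer_sizes):
    """Greedy layer-by-layer DBN training: each RBM models the hidden
    activities of the one below it and is then frozen."""
    layers, h = [], v
    for n_hidden in layer_sizes:
        W, b = train_rbm(h, n_hidden)
        layers.append((W, b))
        h = sigmoid(h @ W + b)   # mean-field activities feed the next RBM
    return layers

# Illustrative usage with binary random data standing in for real inputs.
data = (np.random.default_rng(1).random((200, 64)) > 0.5).astype(float)
dbn = train_dbn(data, [32, 16])
```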

On the quantitative analysis of deep belief networks

It is shown that Annealed Importance Sampling (AIS) can be used to efficiently estimate the partition function of an RBM, and a novel AIS scheme for comparing RBMs with different architectures is presented.

Sparse Feature Learning for Deep Belief Networks

This work proposes a simple criterion to compare and select different unsupervised machines based on the trade-off between the reconstruction error and the information content of the representation, and describes a novel and efficient algorithm to learn sparse representations.
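The trade-off the criterion captures can be written as a two-term objective: reconstruction error of the input from the representation versus a penalty on the representation's information content. The sketch below uses an L1 sparsity penalty on a tied-weight, linear-decoder code as an assumed stand-in for the paper's exact criterion.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sparse_feature_objective(x, W, b, c, sparsity_weight=0.1):
    """Reconstruction error plus a sparsity penalty on the learned code.
    The L1 penalty and its weight are illustrative, not the paper's criterion."""
    h = sigmoid(x @ W + b)                                   # code / representation
    x_hat = h @ W.T + c                                      # linear reconstruction (tied weights)
    reconstruction = 0.5 * np.mean(np.sum((x_hat - x) ** 2, axis=1))
    sparsity = np.mean(np.sum(np.abs(h), axis=1))
    return reconstruction + sparsity_weight * sparsity, reconstruction, sparsity

# Illustrative usage: scoring one candidate feature extractor on random data.
rng = np.random.default_rng(0)
x = rng.random((50, 64))
W, b, c = rng.normal(scale=0.01, size=(64, 32)), np.zeros(32), np.zeros(64)
total, rec, sp = sparse_feature_objective(x, W, b, c)
```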

Deep Boltzmann Machines

A new learning algorithm is presented for Boltzmann machines that contain many layers of hidden variables; it is made more efficient by a layer-by-layer “pre-training” phase that allows variational inference to be initialized with a single bottom-up pass.

An empirical evaluation of deep architectures on problems with many factors of variation

A series of experiments indicate that these models with deep architectures show promise in solving harder learning problems that exhibit many factors of variation.
...