Learning Deep Generative Models

Ruslan Salakhutdinov

Building intelligent systems that are capable of extracting high-level representations from high-dimensional sensory data lies at the core of solving many AI-related tasks, including object recognition, speech perception, and language understanding. Theoretical and biological arguments strongly suggest that building such systems requires models with deep architectures that involve many layers of nonlinear processing. The aim of the thesis is to demonstrate that deep generative models that…

Deep learning systems as complex networks

This article proposes to study deep belief networks using techniques commonly employed in the study of complex networks, in order to gain some insights into the structural and functional properties of the computational graph resulting from the learning process.

Inference in Deep Networks in High Dimensions

  • A. Fletcher, S. Rangan
  • Computer Science
    2018 IEEE International Symposium on Information Theory (ISIT)
  • 2018
The main contribution shows that the mean-squared error (MSE) of ML-VAMP can be exactly predicted in a certain large system limit and matches the Bayes optimal value recently postulated by Reeves when certain fixed point equations have unique solutions.

Predictive learning as a network mechanism for extracting low-dimensional latent space representations

This work investigates the hypothesis that learning to predict observations about the world is a means of generating representations with easily accessed low-dimensional latent structure, and examines whether and when the network mechanisms for sensory prediction coincide with those for extracting the underlying latent variables.

Learning Deep and Wide: A Spectral Method for Learning Deep Networks

  • L. Shao, Di Wu, Xuelong Li
  • Computer Science
    IEEE Transactions on Neural Networks and Learning Systems
  • 2014
This work proposes multispectral neural networks (MSNN) that learn features from multicolumn deep neural networks and embed the penultimate hierarchical discriminative manifolds into a compact representation.

Signatures and mechanisms of low-dimensional neural predictive manifolds

This work investigates the hypothesis that the hippocampus performs its role in sequential planning by organizing semantically related episodes in a relational network learned from a predictive representation of the world, and shows that the network dynamics exhibit low-dimensional but nonlinearly transformed representations of sensory input statistics.

Generative learning for deep networks

It is shown that forward computation in DNNs with logistic sigmoid activations corresponds to simplified approximate Bayesian inference in a directed probabilistic multi-layer model, and it is proposed that, for the recognition and generation networks to be more consistent with the joint model of the data, the weights of the recognition and generator networks should be related by transposition.
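The transposition relationship can be illustrated with a minimal sketch (toy weights and shapes chosen here for illustration, not the paper's model): the recognition pass maps visible units to hidden probabilities through a weight matrix W, and the generative pass maps hidden units back to visibles through the same W used in transposed orientation.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def recognize(v, W, b):
    # Recognition pass: q(h_j = 1 | v) = sigmoid(sum_i v_i * W[i][j] + b_j)
    return [sigmoid(sum(W[i][j] * v[i] for i in range(len(v))) + b[j])
            for j in range(len(b))]

def generate(h, W, c):
    # Generative pass with the *transposed* weights:
    # p(v_i = 1 | h) = sigmoid(sum_j W[i][j] * h_j + c_i)
    return [sigmoid(sum(W[i][j] * h[j] for j in range(len(h))) + c[i])
            for i in range(len(W))]

# Toy example: 3 visible units, 2 hidden units, arbitrary weights.
W = [[0.5, -0.3], [0.1, 0.8], [-0.6, 0.2]]
b, c = [0.0, 0.1], [0.0, 0.0, 0.0]
v = [1, 0, 1]
h = recognize(v, W, b)       # infer hidden activations from data
v_recon = generate(h, W, c)  # reconstruct visibles via W transposed
```

Tying the two directions to a single parameter matrix is what makes the forward pass interpretable as approximate inference in the directed generative model.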

An Overview of Deep Generative Models

Three important deep generative models, namely DBNs, deep autoencoders, and deep Boltzmann machines, are reviewed, and some successful applications of deep generative models in image processing, speech recognition, and information retrieval are introduced and analysed.

Inference With Deep Generative Priors in High Dimensions

This paper shows that the performance of ML-VAMP can be exactly predicted in a certain high-dimensional random limit, and provides a computationally efficient method for multi-layer inference with an exact performance characterization and testable conditions for optimality in the large-system limit.

Generative mixture of networks

A generative model called Mixture of Networks is proposed, consisting of K deep networks trained together to learn the underlying distribution of a given data set; it shows a high capability for characterizing complicated data distributions as well as for clustering data.



Learning Deep Architectures for AI

The motivations and principles behind learning algorithms for deep architectures are discussed, in particular those exploiting unsupervised learning of single-layer models, such as Restricted Boltzmann Machines, as building blocks for constructing deeper models such as Deep Belief Networks.

Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations

The convolutional deep belief network is presented, a hierarchical generative model which scales to realistic image sizes and is translation-invariant and supports efficient bottom-up and top-down probabilistic inference.

Sparse Feature Learning for Deep Belief Networks

This work proposes a simple criterion to compare and select different unsupervised machines based on the trade-off between the reconstruction error and the information content of the representation, and describes a novel and efficient algorithm to learn sparse representations.

Scaling learning algorithms towards AI

It is argued that deep architectures have the potential to generalize in non-local ways, i.e., beyond immediate neighbors, and that this is crucial in order to make progress on the kind of complex tasks required for artificial intelligence.

Greedy Layer-Wise Training of Deep Networks

These experiments confirm the hypothesis that the greedy layer-wise unsupervised training strategy mostly helps optimization by initializing weights in a region near a good local minimum, giving rise to internal distributed representations that are high-level abstractions of the input and thereby bringing better generalization.

A Fast Learning Algorithm for Deep Belief Nets

A fast, greedy algorithm is derived that can learn deep, directed belief networks one layer at a time, provided the top two layers form an undirected associative memory.
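The layer-at-a-time idea can be sketched in a toy form (a simplification with bias terms omitted and plain CD-1 updates; the full algorithm additionally treats the top two layers as an undirected associative memory): fit one RBM to the data, then feed its hidden activation probabilities upward as the "data" for the next RBM.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def up_probs(v, W):
    # p(h_j = 1 | v) for a bias-free RBM with weights W (n_vis x n_hid)
    return [sigmoid(sum(v[i] * W[i][j] for i in range(len(v))))
            for j in range(len(W[0]))]

def down_probs(h, W):
    # p(v_i = 1 | h)
    return [sigmoid(sum(h[j] * W[i][j] for j in range(len(h))))
            for i in range(len(W))]

def cd1_step(v0, W, lr=0.1):
    # One contrastive-divergence (CD-1) update: positive phase on the
    # data, negative phase on a one-step reconstruction.
    ph0 = up_probs(v0, W)
    h0 = [1 if random.random() < p else 0 for p in ph0]
    v1 = down_probs(h0, W)
    ph1 = up_probs(v1, W)
    for i in range(len(W)):
        for j in range(len(W[0])):
            W[i][j] += lr * (v0[i] * ph0[j] - v1[i] * ph1[j])

def train_dbn(data, layer_sizes, epochs=5):
    # Greedy layer-wise scheme: train each RBM in turn, then propagate
    # the data upward through its learned recognition weights.
    weights = []
    for n_vis, n_hid in zip(layer_sizes, layer_sizes[1:]):
        W = [[random.gauss(0.0, 0.1) for _ in range(n_hid)]
             for _ in range(n_vis)]
        for _ in range(epochs):
            for v in data:
                cd1_step(v, W)
        weights.append(W)
        data = [up_probs(v, W) for v in data]
    return weights

random.seed(0)
data = [[1, 1, 0, 0], [0, 0, 1, 1], [1, 1, 0, 0]]
weights = train_dbn(data, [4, 3, 2])  # two stacked RBMs: 4 -> 3 -> 2
```

Each greedy stage only ever trains a single RBM, which is what makes the procedure fast relative to jointly training the whole deep network.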

Training Hierarchical Feed-Forward Visual Recognition Models Using Transfer Learning from Pseudo-Tasks

This paper presents a framework for training hierarchical feed-forward models for visual recognition, using transfer learning from pseudo tasks, and shows that these pseudo tasks induce an informative inverse-Wishart prior on the functional behavior of the network, offering an effective way to incorporate useful prior knowledge into the network training.

On the quantitative analysis of deep belief networks

It is shown that Annealed Importance Sampling (AIS) can be used to efficiently estimate the partition function of an RBM, and a novel AIS scheme for comparing RBMs with different architectures is presented.
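A generic AIS sketch on a one-dimensional toy problem may help fix the idea (the choice of Gaussian base and target here is illustrative; in the RBM setting these log-densities are replaced by RBM free energies): anneal from a tractable base density to the target through a sequence of intermediate distributions, accumulating importance weights along the way.

```python
import math
import random

def ais_log_ratio(log_f0, log_f1, sample0, n_betas=100, n_runs=200):
    # AIS estimate of log(Z1/Z0) for unnormalized densities f0, f1,
    # annealing through f_b(x) = f0(x)^(1-b) * f1(x)^b for b in [0, 1].
    betas = [k / n_betas for k in range(n_betas + 1)]
    log_ws = []
    for _ in range(n_runs):
        x = sample0()  # exact sample from the base distribution
        log_w = 0.0
        for b_prev, b in zip(betas, betas[1:]):
            log_w += (b - b_prev) * (log_f1(x) - log_f0(x))
            # One Metropolis step leaving the intermediate f_b invariant.
            y = x + random.gauss(0.0, 1.0)
            log_fb = lambda z: (1.0 - b) * log_f0(z) + b * log_f1(z)
            if math.log(random.random()) < log_fb(y) - log_fb(x):
                x = y
        log_ws.append(log_w)
    m = max(log_ws)  # log-sum-exp for numerical stability
    return m + math.log(sum(math.exp(w - m) for w in log_ws) / n_runs)

random.seed(0)
# Base: unnormalized N(0,1) with Z0 = sqrt(2*pi).
# Target: exp(-x^2/8), an unnormalized N(0,4) with Z1 = sqrt(8*pi),
# so the true answer is log(Z1/Z0) = log 2.
estimate = ais_log_ratio(lambda x: -x * x / 2.0,
                         lambda x: -x * x / 8.0,
                         lambda: random.gauss(0.0, 1.0))
```

Averaging the importance weights rather than the log-weights is essential: the AIS weights are unbiased for the ratio of partition functions, not for its logarithm.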

Deep Boltzmann Machines

A new learning algorithm is presented for Boltzmann machines that contain many layers of hidden variables; learning is made more efficient by a layer-by-layer "pre-training" phase that allows variational inference to be initialized with a single bottom-up pass.

Representational Power of Restricted Boltzmann Machines and Deep Belief Networks

This work proves that adding hidden units yields strictly improved modeling power, while a second theorem shows that RBMs are universal approximators of discrete distributions and suggests a new and less greedy criterion for training RBMs within DBNs.
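The distribution over visibles that these results concern can be computed exactly on small models, since for binary hidden units the sum over h in the RBM marginal factorizes analytically. A tiny sketch with arbitrary toy parameters:

```python
import itertools
import math

def rbm_log_unnorm_pv(v, W, b, c):
    # Log of the unnormalized marginal p*(v) = sum_h exp(-E(v, h)).
    # With E(v,h) = -b.v - c.h - v^T W h, the sum over binary h gives
    #   p*(v) = exp(b.v) * prod_j (1 + exp(c_j + sum_i v_i W[i][j]))
    log_p = sum(bi * vi for bi, vi in zip(b, v))
    for j in range(len(c)):
        log_p += math.log1p(math.exp(c[j] + sum(v[i] * W[i][j]
                                                for i in range(len(v)))))
    return log_p

W = [[0.5, -0.2], [0.3, 0.7]]  # 2 visible x 2 hidden, arbitrary weights
b, c = [0.1, -0.1], [0.0, 0.2]
unnorm = {v: math.exp(rbm_log_unnorm_pv(list(v), W, b, c))
          for v in itertools.product([0, 1], repeat=2)}
Z = sum(unnorm.values())                      # exact partition function
dist = {v: p / Z for v, p in unnorm.items()}  # normalized p(v)
```

Each hidden unit contributes one multiplicative factor to p*(v); the strict-improvement theorem says that adding such factors strictly enlarges the family of distributions the model can represent.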