# Learning Deep Generative Models

```bibtex
@inproceedings{Salakhutdinov2009LearningDG,
  title  = {Learning Deep Generative Models},
  author = {Ruslan Salakhutdinov},
  year   = {2009}
}
```

Building intelligent systems that are capable of extracting high-level representations from high-dimensional sensory data lies at the core of solving many AI-related tasks, including object recognition, speech perception, and language understanding. Theoretical and biological arguments strongly suggest that building such systems requires models with deep architectures that involve many layers of nonlinear processing.
The aim of the thesis is to demonstrate that deep generative models that…

## 347 Citations

### Deep learning systems as complex networks

- Computer Science, Journal of Complex Networks
- 2019

This article proposes to study deep belief networks using techniques commonly employed in the study of complex networks, in order to gain some insights into the structural and functional properties of the computational graph resulting from the learning process.

### Inference in Deep Networks in High Dimensions

- Computer Science, 2018 IEEE International Symposium on Information Theory (ISIT)
- 2018

The main contribution shows that the mean-squared error (MSE) of ML-VAMP can be exactly predicted in a certain large system limit and matches the Bayes optimal value recently postulated by Reeves when certain fixed point equations have unique solutions.

### Predictive learning as a network mechanism for extracting low-dimensional latent space representations

- Computer Science, Psychology, Nature Communications
- 2021

This work investigates the hypothesis that a means for generating representations with easily accessed low-dimensional latent structure is learning to predict observations about the world, and examines whether and when network mechanisms for sensory prediction coincide with those for extracting the underlying latent variables.

### Learning Deep and Wide: A Spectral Method for Learning Deep Networks

- Computer Science, IEEE Transactions on Neural Networks and Learning Systems
- 2014

This work proposes the multispectral neural networks (MSNN) to learn features from multicolumn deep neural networks and embed the penultimate hierarchical discriminative manifolds into a compact representation.

### Signatures and mechanisms of low-dimensional neural predictive manifolds

- Computer Science, Biology, bioRxiv
- 2018

This work investigates the hypothesis that the hippocampus performs its role in sequential planning by organizing semantically related episodes in a relational network from learning a predictive representation of the world, and shows that network dynamics exhibit low dimensional but non-linearly transformed representations of sensory input statistics.

### Generative learning for deep networks

- Computer Science, ArXiv
- 2017

It is shown that forward computation in DNNs with logistic sigmoid activations corresponds to a simplified approximate Bayesian inference in a directed probabilistic multi-layer model, and it is proposed that, for the recognition and generation networks to be more consistent with the joint model of the data, the weights of the recognition and generator networks should be related by transposition.
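
The weight-tying idea in this abstract can be illustrated with a minimal sketch (all shapes and parameter values here are hypothetical, not taken from the paper): the recognition pass uses a weight matrix and the generative pass reuses its transpose.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(6, 3))  # shared weights (hypothetical sizes)

x = (rng.random(6) < 0.5).astype(float)
h = sigmoid(x @ W)       # recognition (bottom-up) pass: uses W
x_gen = sigmoid(W @ h)   # generative (top-down) pass: uses W transposed
```

With tied weights, a single parameter matrix serves both directions, which is the consistency constraint the abstract refers to.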

### An Overview of Deep Generative Models

- Computer Science
- 2015

Three important deep generative models, including DBNs, the deep autoencoder, and the deep Boltzmann machine, are reviewed, and some successful applications of deep generative models in image processing, speech recognition and information retrieval are introduced and analysed.

### Novel deep generative simultaneous recurrent model for efficient representation learning

- Computer Science, Neural Networks
- 2018

### Inference With Deep Generative Priors in High Dimensions

- Computer Science, IEEE Journal on Selected Areas in Information Theory
- 2020

This paper shows that the performance of ML-VAMP can be exactly predicted in a certain high-dimensional random limit, and provides a computationally efficient method for multi-layer inference with an exact performance characterization and testable conditions for optimality in the large-system limit.

### Generative mixture of networks

- Computer Science, 2017 International Joint Conference on Neural Networks (IJCNN)
- 2017

A generative model called Mixture of Networks, consisting of K networks trained together to learn the underlying distribution of a given data set, is shown to be highly capable of characterizing complicated data distributions as well as clustering data.

## References

Showing 1-10 of 123 references

### Learning Deep Architectures for AI

- Computer Science, Found. Trends Mach. Learn.
- 2007

The motivations and principles regarding learning algorithms for deep architectures are discussed, in particular those exploiting as building blocks unsupervised learning of single-layer models, such as Restricted Boltzmann Machines, used to construct deeper models such as Deep Belief Networks.

### Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations

- Computer Science, ICML '09
- 2009

The convolutional deep belief network is presented, a hierarchical generative model which scales to realistic image sizes and is translation-invariant and supports efficient bottom-up and top-down probabilistic inference.

### Sparse Feature Learning for Deep Belief Networks

- Computer Science, NIPS
- 2007

This work proposes a simple criterion to compare and select different unsupervised machines based on the trade-off between the reconstruction error and the information content of the representation, and describes a novel and efficient algorithm to learn sparse representations.

### Scaling learning algorithms towards AI

- Computer Science
- 2007

It is argued that deep architectures have the potential to generalize in non-local ways, i.e., beyond immediate neighbors, and that this is crucial in order to make progress on the kind of complex tasks required for artificial intelligence.

### Greedy Layer-Wise Training of Deep Networks

- Computer Science, NIPS
- 2006

These experiments confirm the hypothesis that the greedy layer-wise unsupervised training strategy mostly helps the optimization, by initializing weights in a region near a good local minimum, giving rise to internal distributed representations that are high-level abstractions of the input, bringing better generalization.

### A Fast Learning Algorithm for Deep Belief Nets

- Computer Science, Neural Computation
- 2006

A fast, greedy algorithm is derived that can learn deep, directed belief networks one layer at a time, provided the top two layers form an undirected associative memory.
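
The greedy, layer-at-a-time recipe described here can be sketched as stacked RBMs trained with one-step contrastive divergence (CD-1). This is a minimal illustrative implementation, not the paper's exact procedure; all hyperparameters and sizes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_rbm(data, n_hidden, lr=0.1, epochs=50):
    """Train a binary RBM with one step of contrastive divergence (CD-1)."""
    n_visible = data.shape[1]
    W = 0.01 * rng.normal(size=(n_visible, n_hidden))
    b = np.zeros(n_visible)  # visible biases
    c = np.zeros(n_hidden)   # hidden biases
    for _ in range(epochs):
        v0 = data
        p_h0 = sigmoid(v0 @ W + c)                      # positive phase
        h0 = (rng.random(p_h0.shape) < p_h0).astype(float)
        p_v1 = sigmoid(h0 @ W.T + b)                    # one reconstruction step
        p_h1 = sigmoid(p_v1 @ W + c)                    # negative phase
        n = data.shape[0]
        W += lr * (v0.T @ p_h0 - p_v1.T @ p_h1) / n
        b += lr * (v0 - p_v1).mean(axis=0)
        c += lr * (p_h0 - p_h1).mean(axis=0)
    return W, b, c

def greedy_pretrain(data, layer_sizes):
    """Stack RBMs: each layer is trained on the hidden activations of the last."""
    layers, x = [], data
    for n_hidden in layer_sizes:
        W, b, c = train_rbm(x, n_hidden)
        layers.append((W, b, c))
        x = sigmoid(x @ W + c)  # deterministic up-pass feeds the next layer
    return layers

# Toy binary data with two repeated patterns (hypothetical).
data = np.array([[1, 1, 1, 0, 0, 0], [0, 0, 0, 1, 1, 1]] * 10, dtype=float)
layers = greedy_pretrain(data, [4, 3])
```

Each RBM only ever sees the layer below it, which is what makes the procedure greedy and fast relative to training the whole deep network jointly.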

### Training Hierarchical Feed-Forward Visual Recognition Models Using Transfer Learning from Pseudo-Tasks

- Computer Science, ECCV
- 2008

This paper presents a framework for training hierarchical feed-forward models for visual recognition, using transfer learning from pseudo tasks, and shows that these pseudo tasks induce an informative inverse-Wishart prior on the functional behavior of the network, offering an effective way to incorporate useful prior knowledge into the network training.

### On the quantitative analysis of deep belief networks

- Computer Science, ICML '08
- 2008

It is shown that Annealed Importance Sampling (AIS) can be used to efficiently estimate the partition function of an RBM, and a novel AIS scheme for comparing RBMs with different architectures is presented.
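
The core AIS idea can be demonstrated on an RBM small enough that the exact partition function is available by enumeration. This is a simplified sketch of AIS over the joint (v, h) state, annealing from the uniform distribution at beta = 0 to the target RBM at beta = 1; the paper's scheme differs in detail (it anneals from a base-rate RBM and marginalizes the hidden units), and all sizes and schedules here are hypothetical.

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(0)

# Tiny RBM (hypothetical parameters) so exact log Z is computable.
nv, nh = 3, 2
W = rng.normal(scale=0.1, size=(nv, nh))
b = rng.normal(scale=0.1, size=nv)   # visible biases
c = rng.normal(scale=0.1, size=nh)   # hidden biases

def energy(v, h):
    return -(v @ W @ h + b @ v + c @ h)

# Exact log Z by enumerating all 2^(nv+nh) joint states.
log_terms = [-energy(np.array(v), np.array(h))
             for v in product([0, 1], repeat=nv)
             for h in product([0, 1], repeat=nh)]
log_z_exact = np.logaddexp.reduce(log_terms)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gibbs_step(v, h, beta):
    # One alternating Gibbs update; leaves p_beta(v, h) invariant.
    h = (rng.random(nh) < sigmoid(beta * (v @ W + c))).astype(float)
    v = (rng.random(nv) < sigmoid(beta * (W @ h + b))).astype(float)
    return v, h

def ais_log_z(n_chains=500, betas=np.linspace(0.0, 1.0, 201)):
    log_w = np.zeros(n_chains)
    for i in range(n_chains):
        # beta = 0 is the uniform base distribution with Z_0 = 2^(nv+nh).
        v = (rng.random(nv) < 0.5).astype(float)
        h = (rng.random(nh) < 0.5).astype(float)
        for b0, b1 in zip(betas[:-1], betas[1:]):
            log_w[i] += (b1 - b0) * (-energy(v, h))  # importance-weight update
            v, h = gibbs_step(v, h, b1)
        # Estimator: Z ~= Z_0 * mean(exp(log_w)).
    log_z0 = (nv + nh) * np.log(2.0)
    return log_z0 + np.logaddexp.reduce(log_w) - np.log(n_chains)
```

With a fine annealing schedule the AIS estimate of log Z closely tracks the exact value, which is exactly the quantity needed to compare model likelihoods across architectures.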

### Deep Boltzmann Machines

- Computer Science, AISTATS
- 2009

A new learning algorithm for Boltzmann machines that contain many layers of hidden variables is presented, made more efficient by a layer-by-layer "pre-training" phase that allows variational inference to be initialized with a single bottom-up pass.

### Representational Power of Restricted Boltzmann Machines and Deep Belief Networks

- Computer Science, Neural Computation
- 2008

This work proves that adding hidden units yields strictly improved modeling power, shows in a second theorem that RBMs are universal approximators of discrete distributions, and suggests a new and less greedy criterion for training RBMs within DBNs.