Corpus ID: 9383489

Efficient Learning of Deep Boltzmann Machines

@inproceedings{Salakhutdinov2010EfficientLO,
  title={Efficient Learning of Deep Boltzmann Machines},
  author={Ruslan Salakhutdinov and H. Larochelle},
  booktitle={AISTATS},
  year={2010}
}
We present a new approximate inference algorithm for Deep Boltzmann Machines (DBM’s), a generative model with many layers of hidden variables. [...] Finally, we demonstrate that DBM’s trained using the proposed approximate inference algorithm perform well compared to DBN’s and SVM’s on the MNIST handwritten digit, OCR English letters, and NORB visual object recognition tasks.
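For orientation, the inference that the paper sets out to make efficient is mean-field variational inference in a DBM. The sketch below shows the standard mean-field updates for a two-layer DBM; it is illustrative rather than the paper's proposed algorithm, and the function name, array shapes, and iteration count are all assumptions.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def mean_field(v, W1, W2, b1, b2, n_iters=10):
    """Standard mean-field inference for a two-layer DBM (illustrative).

    v  : (n_vis,) binary visible vector
    W1 : (n_vis, n_h1) visible-to-first-hidden weights
    W2 : (n_h1, n_h2) first-to-second-hidden weights
    Returns mean-field parameters (mu1, mu2) approximating p(h1, h2 | v).
    """
    mu2 = np.full(W2.shape[1], 0.5)  # uninformative initialization
    for _ in range(n_iters):
        # h1 receives bottom-up input from v and top-down input from h2;
        # this coupling is what makes DBM inference harder than DBN inference.
        mu1 = sigmoid(v @ W1 + mu2 @ W2.T + b1)
        mu2 = sigmoid(mu1 @ W2 + b2)
    return mu1, mu2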
Citations

An Efficient Learning Procedure for Deep Boltzmann Machines
TLDR
A new learning algorithm for Boltzmann machines that contain many layers of hidden variables is presented, along with results on the MNIST and NORB data sets showing that deep Boltzmann machines learn very good generative models of handwritten digits and 3D objects.
A Two-Stage Pretraining Algorithm for Deep Boltzmann Machines
TLDR
This paper shows empirically that the proposed method overcomes the difficulty in training DBMs from randomly initialized parameters and results in a better, or comparable, generative model when compared to the conventional pretraining algorithm.
Hyper-Parameter-Free Generative Modelling with Deep Boltzmann Trees
TLDR
It is shown that the conditional independence structure of any categorical Deep Boltzmann Machine contains a sub-tree that allows the consistent estimation of the full joint probability mass function of all visible units, and that the DBT is a theoretically sound alternative to likelihood-free generative models.
How to Pretrain Deep Boltzmann Machines in Two Stages
TLDR
This paper shows empirically that the proposed method overcomes the difficulty in training DBMs from randomly initialized parameters and results in a better, or comparable, generative model when compared to the conventional pretraining algorithm.
Soft-Deep Boltzmann Machines
TLDR
This paper proposes an approximate measure of the representational power of a BM, based on the efficiency of its distributed representation, along with an alternative BM architecture that is shown to exploit distributed representations more efficiently under this measure.
Relationship between PreTraining and Maximum Likelihood Estimation in Deep Boltzmann Machines
TLDR
A layer-by-layer greedy pretraining algorithm for a deep Boltzmann machine (DBM) is presented, and it is shown that this pretraining improves the variational bound on the true log-likelihood function of the DBM.
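The variational bound mentioned here is the standard mean-field lower bound on the log-likelihood; with q a factorial posterior over the hidden units, it reads (in LaTeX notation):

\log p(\mathbf{v};\theta) \;\ge\; \mathbb{E}_{q(\mathbf{h}\mid\mathbf{v})}\!\left[\log p(\mathbf{v},\mathbf{h};\theta)\right] + \mathcal{H}\!\left(q(\mathbf{h}\mid\mathbf{v})\right)

The claim summarized above is that each greedy, layer-by-layer pretraining stage improves this lower bound.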
Learning Deep Generative Models with Short Run Inference Dynamics
TLDR
This paper proposes to use short-run inference dynamics guided by the log-posterior, such as a finite-step gradient descent algorithm initialized from the prior distribution of the latent variables, as an approximate sampler of the posterior distribution, where the step size of the gradient descent dynamics is optimized by minimizing the Kullback-Leibler divergence.
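A rough sketch of the idea (not the authors' code): draw the latent variables from the prior and take a small, fixed number of gradient steps on the log-joint, whose gradient with respect to the latents equals the log-posterior gradient. The function name and signature below are assumptions, and the step size is kept fixed here, whereas the paper optimizes it by minimizing a Kullback-Leibler divergence.

import numpy as np

def short_run_inference(v, grad_log_joint, latent_dim, n_steps=25,
                        step_size=0.01, rng=None):
    """Finite-step gradient ascent on log p(v, z), started from the prior,
    used as an approximate posterior sampler (illustrative sketch).

    grad_log_joint(z, v) must return d/dz log p(v, z); since
    log p(z | v) = log p(v, z) - log p(v), this equals the posterior gradient.
    """
    rng = rng or np.random.default_rng()
    z = rng.standard_normal(latent_dim)  # sample from the N(0, I) prior
    for _ in range(n_steps):
        z = z + step_size * grad_log_joint(z, v)
    return z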
Deep Learning using Restricted Boltzmann machines
Restricted Boltzmann machines (RBMs) are probabilistic graphical models that can be represented as stochastic neural networks. Increases in computational capacity and the development of faster learning …
A Deterministic and Generalized Framework for Unsupervised Learning with Restricted Boltzmann Machines
TLDR
This work derives a deterministic framework for the training, evaluation, and use of RBMs based upon the Thouless-Anderson-Palmer (TAP) mean-field approximation of widely-connected systems with weak interactions, coming from spin-glass theory.
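A minimal sketch of what second-order TAP fixed-point iterations look like for a binary {0,1} RBM: a naive mean-field input term plus an Onsager correction built from the squared weights and the neighbouring units' variances. The damping, initialization, and iteration count are illustrative assumptions, not details from the paper.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def tap_fixed_point(W, a, b, n_iters=50, damping=0.5, rng=None):
    """Damped iteration of second-order TAP equations for a binary RBM.

    W : (n_vis, n_hid) weights; a, b : visible / hidden biases.
    Returns approximate marginal means (mv, mh) of the visible and hidden units.
    """
    rng = rng or np.random.default_rng()
    mv = rng.uniform(0.4, 0.6, size=W.shape[0])
    mh = rng.uniform(0.4, 0.6, size=W.shape[1])
    W2 = W ** 2
    for _ in range(n_iters):
        # Naive mean-field input minus the Onsager reaction term.
        var_h = mh * (1.0 - mh)
        mv_new = sigmoid(a + W @ mh - (W2 @ var_h) * (mv - 0.5))
        var_v = mv_new * (1.0 - mv_new)
        mh_new = sigmoid(b + W.T @ mv_new - (W2.T @ var_v) * (mh - 0.5))
        mv = damping * mv + (1.0 - damping) * mv_new
        mh = damping * mh + (1.0 - damping) * mh_new
    return mv, mh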
Variational Probability Flow for Biologically Plausible Training of Deep Neural Networks
TLDR
It is shown that weight updates in VPF are local, depending only on the states and firing rates of the adjacent neurons, and, interestingly, if an asymmetric version of VPF exists, the weight updates directly explain experimental results in Spike-Timing-Dependent Plasticity (STDP).

References

Showing 1–10 of 27 references
Deep Boltzmann Machines
TLDR
A new learning algorithm for Boltzmann machines that contain many layers of hidden variables, made more efficient by a layer-by-layer “pre-training” phase that allows variational inference to be initialized with a single bottom-up pass.
On the quantitative analysis of deep belief networks
TLDR
It is shown that Annealed Importance Sampling (AIS) can be used to efficiently estimate the partition function of an RBM, and a novel AIS scheme for comparing RBM's with different architectures is presented.
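A sketch of the standard AIS recipe for a binary RBM (the general scheme, not necessarily the exact annealing path used in this reference): anneal from a zero-weight base-rate RBM, whose partition function is tractable, to the target RBM, accumulating log importance weights along the way. All names, the linear temperature schedule, and the run counts are illustrative assumptions.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def ais_log_z(W, a, b, n_runs=100, n_betas=1000, rng=None):
    """Estimate log Z of a binary RBM with annealed importance sampling.

    Anneals from the zero-weight RBM (visible biases a only, tractable Z)
    to the target RBM by scaling W and the hidden biases b with beta.
    """
    rng = rng or np.random.default_rng()
    n_vis, n_hid = W.shape
    betas = np.linspace(0.0, 1.0, n_betas)
    # Start with samples from the base-rate RBM: independent visibles.
    v = (rng.random((n_runs, n_vis)) < sigmoid(a)).astype(float)
    log_w = np.zeros(n_runs)

    def log_p_star(v, beta):
        # Unnormalised log-probability with the hiddens summed out.
        return v @ a + np.logaddexp(0.0, beta * (v @ W + b)).sum(axis=1)

    for b0, b1 in zip(betas[:-1], betas[1:]):
        log_w += log_p_star(v, b1) - log_p_star(v, b0)
        # One Gibbs sweep at temperature b1 to track the intermediate model.
        h = (rng.random((n_runs, n_hid)) < sigmoid(b1 * (v @ W + b))).astype(float)
        v = (rng.random((n_runs, n_vis)) < sigmoid(a + b1 * (h @ W.T))).astype(float)

    log_z0 = n_hid * np.log(2.0) + np.logaddexp(0.0, a).sum()
    shift = log_w.max()  # log-sum-exp trick for numerical stability
    return log_z0 + shift + np.log(np.mean(np.exp(log_w - shift)))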
A Fast Learning Algorithm for Deep Belief Nets
TLDR
A fast, greedy algorithm is derived that can learn deep, directed belief networks one layer at a time, provided the top two layers form an undirected associative memory.
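The greedy procedure can be sketched as follows: train an RBM on the data with contrastive divergence, then feed its hidden-unit activations upward as the "data" for the next RBM, and repeat. The sketch below uses full-batch CD-1 for brevity; the layer sizes and hyper-parameters are illustrative assumptions, and the special treatment of the undirected top layer is omitted.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_update(v0, W, a, b, lr, rng):
    """One full-batch contrastive-divergence (CD-1) step for a binary RBM."""
    ph0 = sigmoid(v0 @ W + b)                        # p(h = 1 | v0)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    pv1 = sigmoid(h0 @ W.T + a)                      # one reconstruction step
    ph1 = sigmoid(pv1 @ W + b)
    W += lr * (v0.T @ ph0 - pv1.T @ ph1) / len(v0)   # positive - negative phase
    a += lr * (v0 - pv1).mean(axis=0)
    b += lr * (ph0 - ph1).mean(axis=0)

def greedy_pretrain(data, layer_sizes=(500, 500, 2000), n_epochs=50,
                    lr=0.05, seed=0):
    """Greedy layer-by-layer pretraining: train an RBM, then treat its
    hidden activations as the data for the next RBM (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    x, params = data, []
    for n_hid in layer_sizes:
        W = 0.01 * rng.standard_normal((x.shape[1], n_hid))
        a, b = np.zeros(x.shape[1]), np.zeros(n_hid)
        for _ in range(n_epochs):
            cd1_update(x, W, a, b, lr, rng)
        params.append((W, a, b))
        x = sigmoid(x @ W + b)  # propagate the data up one layer
    return params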
Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations
TLDR
The convolutional deep belief network is presented, a hierarchical generative model which scales to realistic image sizes, is translation-invariant, and supports efficient bottom-up and top-down probabilistic inference.
Learning Deep Architectures for AI
TLDR
The motivations and principles behind learning algorithms for deep architectures are discussed, in particular those that exploit the unsupervised learning of single-layer models, such as Restricted Boltzmann Machines, as building blocks for constructing deeper models such as Deep Belief Networks.
Learning and Evaluating Boltzmann Machines
We provide a brief overview of the variational framework for obtaining deterministic approximations or upper bounds for the log-partition function. We also review some of the Monte Carlo based methods …
3D Object Recognition with Deep Belief Nets
TLDR
A new type of top-level model for Deep Belief Nets, a third-order Boltzmann machine, is introduced; trained using a hybrid algorithm that combines generative and discriminative gradients, it substantially outperforms shallow models such as SVMs.
Efficient Learning of Sparse Representations with an Energy-Based Model
TLDR
A novel unsupervised method for learning sparse, overcomplete features using a linear encoder and a linear decoder preceded by a sparsifying non-linearity that turns a code vector into a quasi-binary sparse code vector.
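As a hedged illustration of that architecture (the exact sparsifying non-linearity in the reference differs; everything below is an assumption made for concreteness): a code vector is computed by a linear encoder, squashed toward quasi-binary values, and then linearly decoded.

import numpy as np

def sparsify(z, beta=5.0):
    # Illustrative stand-in for the sparsifying non-linearity:
    # a steep logistic that pushes code values toward 0 or 1.
    return 1.0 / (1.0 + np.exp(-beta * z))

def encode_decode(x, W_enc, W_dec):
    """Linear encoder -> sparsifying non-linearity -> linear decoder."""
    z = W_enc @ x           # linear encoder produces a code vector
    z_sparse = sparsify(z)  # quasi-binary sparse code
    return z_sparse, W_dec @ z_sparse  # code and reconstruction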
Deep Learning using Robust Interdependent Codes
TLDR
A simple yet effective method for introducing inhibitory and excitatory interactions between units in the layers of a deep neural network classifier is investigated, and it is shown for the first time that lateral connections can significantly improve the classification performance of deep networks.
Annealed importance sampling
R. Neal · Stat. Comput. · 2001
TLDR
It is shown how one can use the Markov chain transitions for such an annealing sequence to define an importance sampler, which can be seen as a generalization of a recently proposed variant of sequential importance sampling.