Rectified Linear Units Improve Restricted Boltzmann Machines
@inproceedings{Nair2010RectifiedLU, title={Rectified Linear Units Improve Restricted Boltzmann Machines}, author={Vinod Nair and Geoffrey E. Hinton}, booktitle={ICML}, year={2010} }
Restricted Boltzmann machines were developed using binary stochastic hidden units. These can be generalized by replacing each binary unit by an infinite number of copies that all have the same weights but have progressively more negative biases. The learning and inference rules for these "Stepped Sigmoid Units" are unchanged. They can be approximated efficiently by noisy, rectified linear units. Compared with binary units, these units learn features that are better for object recognition on the NORB dataset and face verification on the Labeled Faces in the Wild dataset. Unlike binary units, rectified linear units preserve information about relative intensities as information travels through multiple layers of feature detectors.
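The paper's key approximation is easy to state in code: the infinite stack of tied binary units (the "stepped sigmoid units") has expected total activity log(1 + e^x), and a sample from it is approximated by a noisy rectified linear unit, max(0, x + ε) with ε drawn from a zero-mean Gaussian whose variance is sigmoid(x). The sketch below follows those formulas from the paper; the function names and NumPy framing are illustrative, not from the original.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def nrelu_sample(x, rng=None):
    """Noisy rectified linear unit (NReLU): max(0, x + eps) with
    eps ~ N(0, sigmoid(x)), approximating one sample from an infinite
    stack of tied binary units with biases -0.5, -1.5, -2.5, ..."""
    if rng is None:
        rng = np.random.default_rng()
    noise = rng.normal(0.0, np.sqrt(sigmoid(x)))
    return np.maximum(0.0, x + noise)

def ssu_expected_activity(x):
    """Exact expected total activity of the stepped sigmoid units:
    sum_i sigmoid(x - i + 0.5) ~= log(1 + e^x) (softplus),
    computed stably via logaddexp."""
    return np.logaddexp(0.0, x)
```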
12,831 Citations
Restricted Boltzmann Machine with Adaptive Local Hidden Units
- Computer Science · ICONIP
- 2013
Experiments on handwritten digits and human faces show that the proposed RBM variant with adaptive local hidden units (ALRBM) can learn region-based local feature representations that adapt automatically to the content of the images.
On rectified linear units for speech processing
- Computer Science · 2013 IEEE International Conference on Acoustics, Speech and Signal Processing
- 2013
This work shows that replacing logistic units with rectified linear units can improve generalization and make the training of deep networks faster and simpler.
Restricted Boltzmann Machines With Gaussian Visible Units Guided by Pairwise Constraints
- Computer Science · IEEE Transactions on Cybernetics
- 2019
This paper proposes a pairwise-constraints RBM with Gaussian visible units (pcGRBM), in which the learning procedure is guided by pairwise constraints (PCs) and the encoding process is conducted under this guidance, enhancing the expressive ability of traditional RBMs.
Sparse hidden units activation in Restricted Boltzmann Machine
- Computer Science · ICSEng
- 2014
A new regularization term for sparse hidden-unit activation in the Restricted Boltzmann Machine (RBM) is studied, based on the symmetric Kullback-Leibler divergence between the actual and the desired distribution over the active hidden units (a plausible form is sketched below).
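The exact penalty used in this paper isn't spelled out in the snippet above, but a plausible instantiation (an assumption here, not a verbatim reproduction of the paper's formula) is the symmetric KL divergence between a desired Bernoulli activation level and each hidden unit's mean activation over a mini-batch:

```python
import numpy as np

def symmetric_kl_sparsity(hidden_probs, target=0.05, eps=1e-8):
    """Hypothetical sparsity penalty: symmetric KL divergence between a
    target Bernoulli activation level and the batch-mean activation of
    each hidden unit. `hidden_probs` has shape (batch, n_hidden) and
    holds P(h_j = 1 | v); `target` is the desired fraction of active units."""
    q = np.clip(hidden_probs.mean(axis=0), eps, 1.0 - eps)
    p = target
    kl_pq = p * np.log(p / q) + (1 - p) * np.log((1 - p) / (1 - q))
    kl_qp = q * np.log(q / p) + (1 - q) * np.log((1 - q) / (1 - p))
    return float(np.sum(kl_pq + kl_qp))
```

In training, a term like this would be scaled by a small coefficient and added to the RBM's learning objective, pushing mean hidden activations toward the target level.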
Deep neural networks with Elastic Rectified Linear Units for object recognition
- Computer Science · Neurocomputing
- 2018
A Spike and Slab Restricted Boltzmann Machine
- Computer Science · AISTATS
- 2011
We introduce the spike and slab Restricted Boltzmann Machine, characterized by having both a real-valued vector, the slab, and a binary variable, the spike, associated with each unit in the hidden layer…
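To see the structure this snippet describes, here is a toy sketch of a spike-and-slab hidden layer, where each unit is the product of a binary spike and a real-valued slab. The unit slab variance and the absence of any coupling to the visible units are simplifying assumptions; the actual ssRBM conditionals depend on the weights and precision parameters.

```python
import numpy as np

def spike_and_slab_hidden(n_hidden, spike_prob=0.1, rng=None):
    """Toy illustration only: each hidden unit is s * m with a binary
    spike s ~ Bernoulli(spike_prob) gating a real-valued slab m ~ N(0, 1).
    In the actual ssRBM both conditionals depend on the visible units."""
    if rng is None:
        rng = np.random.default_rng()
    spikes = rng.binomial(1, spike_prob, size=n_hidden)  # which units fire
    slabs = rng.normal(0.0, 1.0, size=n_hidden)          # magnitude when firing
    return spikes * slabs
```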
An Efficient Learning Procedure for Deep Boltzmann Machines
- Computer Science · Neural Computation
- 2012
A new learning algorithm for Boltzmann machines that contain many layers of hidden variables is presented, with results on the MNIST and NORB data sets showing that deep Boltzmann machines learn very good generative models of handwritten digits and 3D objects.
Reducing Parameter Space for Neural Network Training
- Computer Science · Theoretical and Applied Mechanics Letters
- 2020
On better training the infinite restricted Boltzmann machines
- Computer Science · Machine Learning
- 2018
Experimental results indicate that the proposed training strategy can greatly accelerate learning and enhance the generalization ability of infinite RBMs (iRBMs).
Phone recognition with deep sparse rectifier neural networks
- Computer Science · 2013 IEEE International Conference on Acoustics, Speech and Signal Processing
- 2013
It is shown that a deep architecture of rectifier neurons can attain the same recognition accuracy as pre-trained deep neural networks, but without the need for pre-training.
References
Showing 1-10 of 21 references
Rate-coded Restricted Boltzmann Machines for Face Recognition
- Computer Science · NIPS
- 2000
We describe a neurally-inspired, unsupervised learning algorithm that builds a non-linear generative model for pairs of face images from the same individual. Individuals are then recognized by…
Implicit Mixtures of Restricted Boltzmann Machines
- Computer Science · NIPS
- 2008
Results for the MNIST and NORB datasets are presented showing that the implicit mixture of RBMs learns clusters that reflect the class structure in the data.
A Fast Learning Algorithm for Deep Belief Nets
- Computer Science · Neural Computation
- 2006
A fast, greedy algorithm is derived that can learn deep, directed belief networks one layer at a time, provided the top two layers form an undirected associative memory.
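The greedy procedure is simple to sketch: fit an RBM to the data, freeze it, propagate the data through it, and fit the next RBM to the resulting activations. In the sketch below, `train_rbm` is a hypothetical helper (e.g., one-step contrastive divergence) standing in for the paper's RBM learning step:

```python
import numpy as np

def greedy_layerwise_pretrain(data, layer_sizes, train_rbm):
    """Greedy layer-wise pretraining sketch. `train_rbm(v, n_hidden)` is an
    assumed helper that fits a single RBM and returns (weights, hidden_bias);
    each layer is trained on the mean hidden activations of the layer below."""
    layers, v = [], data
    for n_hidden in layer_sizes:
        W, b = train_rbm(v, n_hidden)             # fit one RBM greedily
        v = 1.0 / (1.0 + np.exp(-(v @ W + b)))    # propagate mean activations up
        layers.append((W, b))
    return layers
```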
A Hierarchical Community of Experts
- Computer Science · Learning in Graphical Models
- 1998
It is shown that Gibbs sampling can be used to learn the parameters of the linear and binary units even when the sampling is so brief that the Markov chain is far from equilibrium.
Phone recognition using Restricted Boltzmann Machines
- Computer Science · 2010 IEEE International Conference on Acoustics, Speech and Signal Processing
- 2010
Conditional Restricted Boltzmann Machines (CRBMs) have recently proved to be very effective for modeling motion capture sequences, and this paper investigates the application of this more powerful type of generative model to acoustic modeling.
Unsupervised Learning of Distributions of Binary Vectors Using 2-Layer Networks
- Computer Science · NIPS
- 1991
It is shown that arbitrary distributions of binary vectors can be approximated by the combination model, that the weight vectors in the model can be interpreted as high-order correlation patterns among the input bits, and that the combination machine can be used as a mechanism for detecting these patterns.
Diffusion Networks, Products of Experts, and Factor Analysis
- Computer Science
- 2001
It is shown that when the unit activation functions are linear, this PoE architecture is equivalent to a factor analyzer, which suggests novel non-linear generalizations of factor analysis and independent component analysis that could be implemented using interactive neural circuitry.
What is the best multi-stage architecture for object recognition?
- Computer Science · 2009 IEEE 12th International Conference on Computer Vision
- 2009
It is shown that using non-linearities that include rectification and local contrast normalization is the single most important ingredient for good accuracy on object recognition benchmarks and that two stages of feature extraction yield better accuracy than one.
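Local contrast normalization, one of the ingredients this paper singles out, subtracts a local mean and divides by a local standard deviation at every pixel. The sketch below uses a box filter via SciPy's `uniform_filter` for the local averages; the paper's version uses Gaussian-weighted neighborhoods, so treat this as a simplified stand-in:

```python
import numpy as np
from scipy.ndimage import uniform_filter

def local_contrast_normalize(img, size=9, eps=1e-4):
    """Subtractive then divisive normalization over local windows of a
    2-D image. Box-filter simplification of the Gaussian-weighted scheme."""
    centered = img - uniform_filter(img, size)               # remove local mean
    local_std = np.sqrt(uniform_filter(centered**2, size))   # local std estimate
    return centered / np.maximum(local_std, eps)             # avoid divide-by-zero
```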
Reducing the Dimensionality of Data with Neural Networks
- Computer Science · Science
- 2006
This work describes an effective way of initializing the weights that allows deep autoencoder networks to learn low-dimensional codes that work much better than principal components analysis as a tool to reduce the dimensionality of data.
Learning methods for generic object recognition with invariance to pose and lighting
- Computer Science · Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2004)
- 2004
A real-time version of the system was implemented that can detect and classify objects in natural scenes at around 10 frames per second; other methods proved impractical, while convolutional nets yielded 16.7% error.