Corpus ID: 5867279

Sparse Feature Learning for Deep Belief Networks

Marc'Aurelio Ranzato, Y-Lan Boureau, Yann LeCun
Unsupervised learning algorithms aim to discover the structure hidden in the data, and to learn representations that are more suitable as input to a supervised machine than the raw input. Many unsupervised methods are based on reconstructing the input from the representation, while constraining the representation to have certain desirable properties (e.g. low dimension, sparsity, etc). Others are based on approximating density by stochastically reconstructing the input from the representation… 
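The reconstruction-with-sparsity idea can be illustrated with a toy sketch (not the paper's exact model): a tied-weight autoencoder trained by gradient descent on random data, with an L1 penalty pushing the hidden code toward sparsity. All sizes and hyperparameters below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 16))                   # toy data: 200 samples, 16 dims

n_in, n_hid = 16, 8
W = rng.normal(scale=0.1, size=(n_in, n_hid))    # tied encoder/decoder weights
b_h, b_v = np.zeros(n_hid), np.zeros(n_in)
lam, lr = 0.1, 0.05                              # sparsity weight, learning rate

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

losses = []
for _ in range(200):
    H = sigmoid(X @ W + b_h)                     # code (penalized toward sparsity)
    Xhat = H @ W.T + b_v                         # linear reconstruction
    R = Xhat - X
    losses.append((R ** 2).sum() / len(X) + lam * np.abs(H).sum() / len(X))
    # Backprop through reconstruction loss + L1 subgradient on the code.
    dXhat = 2 * R / len(X)
    dH = dXhat @ W + lam * np.sign(H) / len(X)
    dA = dH * H * (1 - H)                        # sigmoid derivative
    W -= lr * (X.T @ dA + dXhat.T @ H)           # tied-weight gradient (both uses of W)
    b_h -= lr * dA.sum(axis=0)
    b_v -= lr * dXhat.sum(axis=0)
```

The L1 term is the "desirable property" constraint from the abstract; swapping it for a dimensionality bottleneck or a different penalty recovers the other variants the abstract alludes to.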

Unsupervised feature learning via sparse hierarchical representations

This work describes how efficient sparse coding algorithms — which represent each input example using a small number of basis vectors — can be used to learn good low-level representations from unlabeled data, and shows that this gives feature representations that yield improved performance in many machine learning tasks.
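Sparse coding in this sense means inferring, for each input, a code that uses only a few basis vectors. A minimal sketch using ISTA (iterative soft thresholding) over a random dictionary — the dictionary, penalty, and data here are all illustrative, not from the cited work:

```python
import numpy as np

rng = np.random.default_rng(1)

# Overcomplete random dictionary: 32 unit-norm atoms for 16-dim signals.
D = rng.normal(size=(16, 32))
D /= np.linalg.norm(D, axis=0)

# Signal built from 3 atoms, so a sparse code exists.
true_code = np.zeros(32)
true_code[[2, 7, 19]] = [1.5, -2.0, 1.0]
x = D @ true_code

lam = 0.1                                        # L1 penalty weight
L = np.linalg.norm(D, 2) ** 2                    # Lipschitz constant of the gradient
z = np.zeros(32)
for _ in range(500):
    grad = D.T @ (D @ z - x)                     # gradient of 0.5 * ||x - D z||^2
    z = z - grad / L
    z = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)   # soft threshold

nonzero = int((np.abs(z) > 1e-3).sum())          # few active basis vectors remain
```

The soft-threshold step is what zeroes out most coefficients, so each input ends up represented by a small number of basis vectors.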

A Novel Sparse Deep Belief Network for Unsupervised Feature Learning

A novel version of sparse deep belief network for unsupervised feature extraction that learns hierarchical representations mimicking computations in the cortical hierarchy, and obtains more discriminative representations than PCA and several basic deep belief network algorithms.

Effective sparsity control in deep belief networks using normal regularization term

A new method is proposed whose behavior depends on how far the activation of the hidden units deviates from a (low) fixed value, with a variance parameter that controls the strength of the sparsity constraint.

Unsupervised feature learning using Markov deep belief network

A new deep learning model, named Markov DBN (MDBN), is proposed to address limitations of the DBN; it reduces the computational burden and can handle large images.

Deep Learning of Representations for Unsupervised and Transfer Learning

  • Yoshua Bengio, ICML Unsupervised and Transfer Learning, 2012
Explains why unsupervised pre-training of representations can be useful, and how it can be exploited in the transfer learning scenario, where the goal is to make predictions on examples that are not drawn from the training distribution.

Sparse Deep Belief Net for Handwritten Digits Classification

Another version of Sparse Deep Belief Net is proposed which applies the differentiable sparse coding method to train the first layer of the deep network and then trains the higher layers with RBMs, leading to state-of-the-art performance on handwritten digit classification.

Learning Feature Hierarchies for Object Recognition

This thesis proposes sparse-modeling algorithms as the foundation for unsupervised feature extraction systems, and proposes convolutional sparse coding algorithms that yield a richer set of dictionary elements, reduce the redundancy of the representation and improve recognition performance.

Unsupervised Learning Under Uncertainty

Two methods to address the problem of video prediction are introduced, first using a novel form of linearizing auto-encoder and latent variables, and secondly using Generative Adversarial Networks (GANs), to show how GANs can be seen as trainable loss functions to represent uncertainty, then how they can be used to disentangle factors of variation.

An Analysis of Single-Layer Networks in Unsupervised Feature Learning

The results show that large numbers of hidden nodes and dense feature extraction are critical to achieving high performance; in fact, when these parameters are pushed to their limits, a single layer of features achieves state-of-the-art performance on both CIFAR-10 and NORB.

Unsupervised Feature Learning and Deep Learning: A Review and New Perspectives

Recent work in the area of unsupervised feature learning and deep learning is reviewed, covering advances in probabilistic models, manifold learning, and deep learning.


A Fast Learning Algorithm for Deep Belief Nets

A fast, greedy algorithm is derived that can learn deep, directed belief networks one layer at a time, provided the top two layers form an undirected associative memory.
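The building block of that greedy algorithm is a single RBM trained with contrastive divergence; stacking such layers, each trained on the previous layer's hidden activations, gives the layer-wise procedure. A minimal CD-1 sketch on toy binary data — all sizes and hyperparameters are illustrative, not from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda a: 1.0 / (1.0 + np.exp(-a))

# Toy binary data: noisy copies of two 12-bit template patterns.
templates = np.array([[1] * 6 + [0] * 6, [0] * 6 + [1] * 6], dtype=float)
X = templates[rng.integers(0, 2, size=200)]
X = np.abs(X - (rng.random(X.shape) < 0.05))     # 5% bit flips

n_vis, n_hid = 12, 4
W = rng.normal(scale=0.01, size=(n_vis, n_hid))
a = np.zeros(n_vis)                              # visible biases
b = np.zeros(n_hid)                              # hidden biases
lr = 0.1

errs = []
for epoch in range(50):
    # Positive phase: hidden probabilities given the data, then a sample.
    ph = sigmoid(X @ W + b)
    h = (rng.random(ph.shape) < ph).astype(float)
    # Negative phase: one-step Gibbs reconstruction (CD-1).
    pv = sigmoid(h @ W.T + a)
    ph2 = sigmoid(pv @ W + b)
    # Contrastive-divergence updates: data statistics minus model statistics.
    W += lr * (X.T @ ph - pv.T @ ph2) / len(X)
    a += lr * (X - pv).mean(axis=0)
    b += lr * (ph - ph2).mean(axis=0)
    errs.append(((X - pv) ** 2).mean())
```

To go deep, the hidden probabilities `ph` of a trained layer would be fed as "visible" data to the next RBM, one layer at a time.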

Greedy Layer-Wise Training of Deep Networks

These experiments confirm the hypothesis that the greedy layer-wise unsupervised training strategy mostly helps the optimization, by initializing weights in a region near a good local minimum, giving rise to internal distributed representations that are high-level abstractions of the input, bringing better generalization.

A Unified Energy-Based Framework for Unsupervised Learning

A view of unsupervised learning is introduced that integrates probabilistic and nonprobabilistic methods for clustering, dimensionality reduction, and feature extraction in a unified framework and shows that a simple solution is to restrict the amount of information contained in codes that represent the data.

Energy-Based Models for Sparse Overcomplete Representations

A new way of extending independent components analysis (ICA) to overcomplete representations is presented, defining features as deterministic (linear) functions of the inputs and assigning energies to the features through the Boltzmann distribution.

Learning Overcomplete Representations

It is shown that overcomplete bases can yield a better approximation of the underlying statistical distribution of the data and can thus lead to greater coding efficiency and provide a method for Bayesian reconstruction of signals in the presence of noise and for blind source separation when there are more sources than mixtures.

Scaling learning algorithms towards AI

It is argued that deep architectures have the potential to generalize in non-local ways, i.e., beyond immediate neighbors, and that this is crucial in order to make progress on the kind of complex tasks required for artificial intelligence.

Learning Sparse Overcomplete Codes for Images

A survey of algorithms that perform dictionary learning and sparse coding is presented and a modified version of the FOCUSS algorithm is presented that can find a non-negative sparse coding in some cases.

Learning the parts of objects by non-negative matrix factorization

An algorithm for non-negative matrix factorization is demonstrated that is able to learn parts of faces and semantic features of text and is in contrast to other methods that learn holistic, not parts-based, representations.
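The algorithm in question is the Lee & Seung multiplicative-update rule, which keeps both factors non-negative by construction. A small sketch for the Frobenius-norm objective on synthetic data (the data and sizes are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

# Non-negative data built from two non-negative "parts".
parts = np.array([[1, 1, 1, 0, 0, 0], [0, 0, 0, 1, 1, 1]], dtype=float)
V = rng.random((100, 2)) @ parts + 0.01          # 100 samples, 6 features

k = 2
W = rng.random((100, k)) + 0.1                   # non-negative init
H = rng.random((k, 6)) + 0.1
eps = 1e-9                                       # avoids division by zero

errs = []
for _ in range(200):
    # Multiplicative updates: element-wise ratios keep W, H non-negative.
    H *= (W.T @ V) / (W.T @ W @ H + eps)
    W *= (V @ H.T) / (W @ H @ H.T + eps)
    errs.append(np.linalg.norm(V - W @ H))
```

Because each update multiplies by a non-negative ratio, no entry can go negative — which is exactly what forces the parts-based (rather than holistic) decomposition the summary describes.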

Reducing the Dimensionality of Data with Neural Networks

This work describes an effective way of initializing the weights that allows deep autoencoder networks to learn low-dimensional codes that work much better than principal components analysis as a tool to reduce the dimensionality of data.

Learning Sparse Multiscale Image Representations

A method for learning sparse multiscale image representations is presented, using a sparse prior distribution over the basis function coefficients — a mixture of a Gaussian and a Dirac delta function — which encourages coefficients to take exactly zero values.