# Sparse Feature Learning for Deep Belief Networks

```bibtex
@inproceedings{Ranzato2007SparseFL,
  title     = {Sparse Feature Learning for Deep Belief Networks},
  author    = {Marc'Aurelio Ranzato and Y-Lan Boureau and Yann LeCun},
  booktitle = {NIPS},
  year      = {2007}
}
```

Unsupervised learning algorithms aim to discover the structure hidden in the data, and to learn representations that are more suitable as input to a supervised machine than the raw input. Many unsupervised methods are based on reconstructing the input from the representation, while constraining the representation to have certain desirable properties (e.g. low dimension, sparsity, etc). Others are based on approximating density by stochastically reconstructing the input from the representation…
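The reconstruction-with-constraints idea in the abstract can be made concrete with a toy objective: reconstruct the input from a code while penalizing non-sparse codes. The sketch below is purely illustrative (the array shapes, the ReLU encoder, and the L1 penalty are my assumptions, not the paper's actual energy-based formulation):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 100 inputs of dimension 16 (all sizes here are illustrative).
X = rng.normal(size=(100, 16))

# Encoder/decoder weights for an overcomplete code of dimension 32.
W_enc = rng.normal(scale=0.1, size=(16, 32))
W_dec = rng.normal(scale=0.1, size=(32, 16))

def sparse_recon_objective(X, W_enc, W_dec, lam=0.1):
    """Reconstruction error plus an L1 sparsity penalty on the code."""
    Z = np.maximum(X @ W_enc, 0.0)        # code: rectified linear features
    X_hat = Z @ W_dec                     # reconstruction of the input
    recon = np.mean((X - X_hat) ** 2)     # how well the code explains X
    sparsity = lam * np.mean(np.abs(Z))   # pressure toward sparse codes
    return recon + sparsity

print(sparse_recon_objective(X, W_enc, W_dec))
```

Minimizing such an objective over the weights trades reconstruction fidelity against sparsity of the representation, which is the tension the abstract describes.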

## 869 Citations

### Unsupervised feature learning via sparse hierarchical representations

- Computer Science
- 2010

This work describes how efficient sparse coding algorithms — which represent each input example using a small number of basis vectors — can be used to learn good low-level representations from unlabeled data, and shows that this gives feature representations that yield improved performance in many machine learning tasks.
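As an illustration of representing each input with a small number of basis vectors, here is a minimal greedy matching pursuit in NumPy. The dictionary, the sparsity level `k`, and all names are hypothetical; the cited work uses its own, more efficient sparse coding algorithms:

```python
import numpy as np

def matching_pursuit(x, D, k=3):
    """Greedily pick up to k dictionary atoms (columns of D) to approximate x."""
    residual = x.astype(float).copy()
    code = np.zeros(D.shape[1])
    for _ in range(k):
        corr = D.T @ residual            # correlation with each atom
        j = np.argmax(np.abs(corr))      # best-matching atom
        code[j] += corr[j]               # add its contribution to the code
        residual -= corr[j] * D[:, j]    # remove it from the residual
    return code

rng = np.random.default_rng(1)
D = rng.normal(size=(8, 20))
D /= np.linalg.norm(D, axis=0)           # unit-norm atoms
x = rng.normal(size=8)
z = matching_pursuit(x, D, k=3)
print(np.count_nonzero(z))               # at most 3 active coefficients
```

Each iteration strictly shrinks the residual, so the input is approximated by a handful of basis vectors, which is the "small number of basis vectors" property the summary refers to.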

### A Novel Sparse Deep Belief Network for Unsupervised Feature Learning

- Computer Science
- 2012

A novel version of sparse deep belief network for unsupervised feature extraction that learns hierarchical representations which mimic computations in the cortical hierarchy, and obtains more discriminative representations than PCA and several basic deep belief network algorithms.

### Effective sparsity control in deep belief networks using normal regularization term

- Computer Science
- Knowledge and Information Systems
- 2017

A new method is proposed whose behavior depends on how far the activations of the hidden units deviate from a (low) fixed value, and which has a variance parameter that controls how strongly sparsity is enforced.

### Unsupervised feature learning using Markov deep belief network

- Computer Science
- 2013 IEEE International Conference on Image Processing
- 2013

A new deep learning model, named Markov DBN (MDBN), is proposed to address shortcomings of the DBN; it employs a new training scheme that reduces the computational burden and handles large images.

### Deep Learning of Representations for Unsupervised and Transfer Learning

- Computer Science
- ICML Unsupervised and Transfer Learning
- 2012

This work explains why unsupervised pre-training of representations can be useful, and how it can be exploited in the transfer learning scenario, where one cares about predictions on examples that are not drawn from the same distribution as the training data.

### Sparse Deep Belief Net for Handwritten Digits Classification

- Computer Science
- AICI
- 2010

Another version of Sparse Deep Belief Net is proposed which applies the differentiable sparse coding method to train the first level of the deep network and then trains the higher layers with RBMs, leading to state-of-the-art performance on the classification of handwritten digits.

### Learning Feature Hierarchies for Object Recognition

- Computer Science
- 2010

This thesis proposes sparse-modeling algorithms as the foundation for unsupervised feature extraction systems, and proposes convolutional sparse coding algorithms that yield a richer set of dictionary elements, reduce the redundancy of the representation and improve recognition performance.

### Unsupervised Learning Under Uncertainty

- Computer Science
- 2017

Two methods to address the problem of video prediction are introduced, first using a novel form of linearizing auto-encoder and latent variables, and secondly using Generative Adversarial Networks (GANs), to show how GANs can be seen as trainable loss functions to represent uncertainty, then how they can be used to disentangle factors of variation.

### An Analysis of Single-Layer Networks in Unsupervised Feature Learning

- Computer Science
- AISTATS
- 2011

The results show that large numbers of hidden nodes and dense feature extraction are critical to achieving high performance—so critical, in fact, that when these parameters are pushed to their limits, they achieve state-of-the-art performance on both CIFAR-10 and NORB using only a single layer of features.

### Unsupervised Feature Learning and Deep Learning: A Review and New Perspectives

- Computer Science
- ArXiv
- 2012

Recent work in the area of unsupervised feature learning and deep learning is reviewed, covering advances in probabilistic models, manifold learning, and deep learning.

## References

Showing 1-10 of 26 references.

### A Fast Learning Algorithm for Deep Belief Nets

- Computer Science
- Neural Computation
- 2006

A fast, greedy algorithm is derived that can learn deep, directed belief networks one layer at a time, provided the top two layers form an undirected associative memory.

### Greedy Layer-Wise Training of Deep Networks

- Computer Science
- NIPS
- 2006

These experiments confirm the hypothesis that the greedy layer-wise unsupervised training strategy mostly helps the optimization, by initializing weights in a region near a good local minimum, giving rise to internal distributed representations that are high-level abstractions of the input, bringing better generalization.
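The greedy layer-wise strategy this summary describes can be sketched as follows: each layer is trained unsupervised on the codes produced by the layer below, then frozen. The toy below uses linear autoencoders rather than the RBMs of the paper; the dimensions, step counts, and names are all illustrative:

```python
import numpy as np

def train_autoencoder_layer(H, dim, steps=300, lr=0.01, seed=0):
    """Fit one linear autoencoder layer (unsupervised) by gradient descent."""
    rng = np.random.default_rng(seed)
    W_e = rng.normal(scale=0.1, size=(H.shape[1], dim))
    W_d = rng.normal(scale=0.1, size=(dim, H.shape[1]))
    for _ in range(steps):
        Z = H @ W_e                            # codes for this layer
        E = Z @ W_d - H                        # reconstruction error
        W_d -= lr * (Z.T @ E) / len(H)         # decoder gradient step
        W_e -= lr * (H.T @ (E @ W_d.T)) / len(H)  # encoder gradient step
    return W_e

rng = np.random.default_rng(42)
X = rng.normal(size=(200, 20))
layers, H = [], X
for dim in (12, 6):                            # greedy: one layer at a time
    W = train_autoencoder_layer(H, dim)
    layers.append(W)
    H = H @ W                                  # next layer trains on these codes
print(H.shape)                                 # → (200, 6)
```

The key point the summary makes is that this layer-by-layer procedure mostly helps optimization, by initializing the stack near a good solution before any supervised fine-tuning.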

### A Unified Energy-Based Framework for Unsupervised Learning

- Computer Science
- AISTATS
- 2007

A view of unsupervised learning is introduced that integrates probabilistic and nonprobabilistic methods for clustering, dimensionality reduction, and feature extraction in a unified framework and shows that a simple solution is to restrict the amount of information contained in codes that represent the data.

### Energy-Based Models for Sparse Overcomplete Representations

- Computer Science
- J. Mach. Learn. Res.
- 2003

A new way of extending independent components analysis (ICA) to overcomplete representations that defines features as deterministic (linear) functions of the inputs and assigns energies to the features through the Boltzmann distribution.

### Learning Overcomplete Representations

- Computer Science
- Neural Computation
- 2000

It is shown that overcomplete bases can yield a better approximation of the underlying statistical distribution of the data and can thus lead to greater coding efficiency and provide a method for Bayesian reconstruction of signals in the presence of noise and for blind source separation when there are more sources than mixtures.

### Scaling learning algorithms towards AI

- Computer Science
- 2007

It is argued that deep architectures have the potential to generalize in non-local ways, i.e., beyond immediate neighbors, and that this is crucial in order to make progress on the kind of complex tasks required for artificial intelligence.

### Learning Sparse Overcomplete Codes for Images

- Computer Science
- J. VLSI Signal Process.
- 2007

A survey of algorithms that perform dictionary learning and sparse coding is presented and a modified version of the FOCUSS algorithm is presented that can find a non-negative sparse coding in some cases.

### Learning the parts of objects by non-negative matrix factorization

- Computer Science
- Nature
- 1999

An algorithm for non-negative matrix factorization is demonstrated that is able to learn parts of faces and semantic features of text and is in contrast to other methods that learn holistic, not parts-based, representations.
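The parts-based learning in this entry rests on multiplicative updates that keep both factors non-negative. Below is a compact sketch of the classic Lee-Seung Frobenius-norm updates; the matrix sizes and rank are arbitrary choices for illustration:

```python
import numpy as np

def nmf(V, r, iters=200, seed=0):
    """Lee-Seung multiplicative updates for V ≈ W @ H with all entries ≥ 0."""
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], r)) + 0.1
    H = rng.random((r, V.shape[1])) + 0.1
    for _ in range(iters):
        H *= (W.T @ V) / (W.T @ W @ H + 1e-9)   # update keeps H non-negative
        W *= (V @ H.T) / (W @ H @ H.T + 1e-9)   # update keeps W non-negative
    return W, H

rng = np.random.default_rng(3)
V = rng.random((30, 20))                        # non-negative data matrix
W, H = nmf(V, r=5)
print(W.min() >= 0 and H.min() >= 0)            # → True
```

Because factors can never turn negative, features can only add up, never cancel; this additivity is what yields the parts-based (rather than holistic) representations the summary contrasts with other methods.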

### Reducing the Dimensionality of Data with Neural Networks

- Computer Science
- Science
- 2006

This work describes an effective way of initializing the weights that allows deep autoencoder networks to learn low-dimensional codes that work much better than principal components analysis as a tool to reduce the dimensionality of data.

### Learning Sparse Multiscale Image Representations

- Computer Science
- NIPS
- 2002

A method for learning sparse multiscale image representations using a sparse prior distribution over the basis function coefficients, which includes a mixture of a Gaussian and a Dirac delta function, and thus encourages coefficients to have exact zero values.
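The spike-and-slab prior mentioned here (a mixture of a Gaussian and a Dirac delta at zero) is easy to visualize by sampling from it; the mixing weight and scale below are arbitrary illustrative values:

```python
import numpy as np

rng = np.random.default_rng(7)

def spike_and_slab(n, p_active=0.1, scale=1.0):
    """Sample coefficients from a Dirac-at-zero / Gaussian mixture prior."""
    active = rng.random(n) < p_active     # slab: Gaussian with prob p_active
    return np.where(active, rng.normal(scale=scale, size=n), 0.0)

c = spike_and_slab(1000)
print(np.mean(c == 0.0))                  # most coefficients are exactly zero
```

The Dirac component places positive probability mass on exactly zero, which is what encourages coefficients to be exactly (not merely approximately) zero, as the summary notes.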