Reducing the dimensionality of data with neural networks.


High-dimensional data can be converted to low-dimensional codes by training a multilayer neural network with a small central layer to reconstruct high-dimensional input vectors. Gradient descent can be used for fine-tuning the weights in such "autoencoder" networks, but this works well only if the initial weights are close to a good solution. We describe an effective way of initializing the weights that allows deep autoencoder networks to learn low-dimensional codes that work much better than principal components analysis as a tool to reduce the dimensionality of data.
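As a rough illustration of the kind of network the abstract describes, the following is a minimal NumPy sketch of an autoencoder with a small central code layer, trained by plain full-batch gradient descent on squared reconstruction error. It is not the authors' method: the weights here start from small random values rather than the initialization the paper proposes, and the function names (`train_autoencoder`, `encode`), layer sizes, learning rate, and step count are all assumptions chosen for the example.

```python
import numpy as np


def train_autoencoder(X, code_dim=2, hidden_dim=16, lr=0.05, steps=300, seed=0):
    """Fit input -> hidden -> code -> hidden -> output by gradient descent.

    Hypothetical helper for illustration; returns weights, biases and the
    mean squared reconstruction error recorded at every step.
    """
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    sizes = [d, hidden_dim, code_dim, hidden_dim, d]
    # Assumption: random initial weights scaled by 1/sqrt(fan_in).
    # The paper's point is that a better initialization is needed for deep nets.
    Ws = [rng.normal(0.0, 1.0 / np.sqrt(a), (a, b))
          for a, b in zip(sizes[:-1], sizes[1:])]
    bs = [np.zeros(b) for b in sizes[1:]]
    tanh_layers = (0, 2)  # hidden layers use tanh; code and output are linear
    losses = []
    for _ in range(steps):
        # Forward pass, keeping every layer's activations for backpropagation.
        acts = [X]
        for i, (W, b) in enumerate(zip(Ws, bs)):
            z = acts[-1] @ W + b
            acts.append(np.tanh(z) if i in tanh_layers else z)
        err = acts[-1] - X
        losses.append(float(np.mean(err ** 2)))
        # Backward pass: gradient of the mean squared error w.r.t. the output.
        delta = 2.0 * err / err.size
        for i in reversed(range(len(Ws))):
            if i in tanh_layers:
                delta = delta * (1.0 - acts[i + 1] ** 2)  # tanh derivative
            grad_W = acts[i].T @ delta
            grad_b = delta.sum(axis=0)
            delta = delta @ Ws[i].T  # propagate before updating this layer
            Ws[i] -= lr * grad_W
            bs[i] -= lr * grad_b
    return Ws, bs, losses


def encode(X, Ws, bs):
    """Map inputs to the low-dimensional code (output of the central layer)."""
    hidden = np.tanh(X @ Ws[0] + bs[0])
    return hidden @ Ws[1] + bs[1]
```

Calling `train_autoencoder(X)` on an `(n_samples, n_features)` array and then `encode(X, Ws, bs)` gives one `code_dim`-dimensional code per sample. With a network this shallow, random initialization is usually adequate; the difficulty the abstract refers to arises in deeper autoencoders.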



For the conjugate gradient fine-tuning, we used Carl Rasmussen's "minimize" code available at http

Matlab code for LLE is available at http

Matlab code for generating the images of curves is available at http

The 20 newsgroups dataset (called 20news-bydate.tar.gz) is available at http://people
