Large Scale Distributed Deep Networks

Jeffrey Dean, Gregory S. Corrado, Rajat Monga, Kai Chen, Matthieu Devin, Quoc V. Le, Mark Z. Mao, Marc'Aurelio Ranzato, Andrew W. Senior, Paul A. Tucker, Ke Yang, Andrew Y. Ng
Recent work in unsupervised feature learning and deep learning has shown that being able to train large models can dramatically improve performance. In this paper, we consider the problem of training a deep network with billions of parameters using tens of thousands of CPU cores. We have developed a software framework called DistBelief that can utilize computing clusters with thousands of machines to train large models. Within this framework, we have developed two algorithms for large-scale…
Highly Influential
This paper has highly influenced 145 other papers.
Highly Cited
This paper has 1,814 citations.


Semantic Scholar estimates that this publication has 1,815 citations based on the available data.


Publications referenced by this paper (partial list; 29 references in total).

Context-Dependent Pre-Trained Deep Neural Networks for Large-Vocabulary Speech Recognition

IEEE Transactions on Audio, Speech, and Language Processing • 2012

Building high-level features using large scale unsupervised learning

2013 IEEE International Conference on Acoustics, Speech and Signal Processing • 2012

Efficient BackProp

Neural Networks: Tricks of the Trade • 2012

Multi-column deep neural networks for image classification

2012 IEEE Conference on Computer Vision and Pattern Recognition • 2012

Scalable stacking and learning for building deep architectures

2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) • 2012
