• Computer Science
  • Published in ICLR 2016

DSD: Dense-Sparse-Dense Training for Deep Neural Networks

@inproceedings{Han2016DSDDT,
  title={DSD: Dense-Sparse-Dense Training for Deep Neural Networks},
  author={Song Han and Jeff Pool and Sharan Narang and Huizi Mao and Enhao Gong and Shijian Tang and Erich Elsen and Peter Vajda and Manohar Paluri and John Tran and Bryan Catanzaro and William J. Dally},
  booktitle={ICLR},
  year={2016}
}
Modern deep neural networks have a large number of parameters, making them very hard to train. We propose DSD, a dense-sparse-dense training flow, for regularizing deep neural networks and achieving better optimization performance. In the first D (Dense) step, we train a dense network to learn connection weights and importance. In the S (Sparse) step, we regularize the network by pruning the unimportant connections with small weights and retraining the network given the sparsity constraint. In the final D (re-Dense) step, we increase the model capacity by removing the sparsity constraint, re-initializing the pruned parameters to zero, and retraining the whole dense network.
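The three-phase flow described in the abstract maps onto a short training loop. Below is a minimal sketch in PyTorch on a toy regression problem; the network, data, 50% per-layer pruning ratio, epoch counts, and learning rates are illustrative assumptions, not the paper's settings.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
X, y = torch.randn(512, 20), torch.randn(512, 1)      # toy regression data
model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 1))


def train(model, epochs, lr, masks=None):
    """Plain full-batch SGD; when masks are given, zero the pruned weights
    after every update so the network stays sparse (the S step)."""
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(model(X), y)
        loss.backward()
        opt.step()
        if masks is not None:
            with torch.no_grad():
                for name, p in model.named_parameters():
                    if name in masks:
                        p.mul_(masks[name])
    return loss.item()


# D: train the dense network to learn the weights and their importance.
train(model, epochs=200, lr=0.05)

# S: prune the smallest-magnitude weights (here 50% per layer) and retrain
# the surviving connections under that fixed sparsity constraint.
masks = {}
with torch.no_grad():
    for name, p in model.named_parameters():
        if p.dim() > 1:                    # prune weight matrices, keep biases dense
            threshold = p.abs().flatten().quantile(0.5)
            masks[name] = (p.abs() > threshold).float()
            p.mul_(masks[name])
train(model, epochs=200, lr=0.05, masks=masks)

# Re-D: drop the sparsity constraint, let the pruned weights (currently zero)
# grow back, and retrain the whole dense network at a lower learning rate.
final_loss = train(model, epochs=200, lr=0.005)
print(f"loss after dense-sparse-dense training: {final_loss:.4f}")
```

Re-applying the mask after every optimizer step is one simple way to hold the pruned weights at zero during the sparse phase; the final call trains without a mask, which is what allows the recovered connections to re-learn.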

Citations

Publications citing this paper (showing 1-10 of 42).

Hierarchical Attention-based Fuzzy Neural Network for Subject Classification of Power Customer Service Work Orders

  • 2019 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE)
  • 2019
  • Highly influenced; cites methods

DropPruning for Model Compression

  • ArXiv
  • 2018
  • Highly influenced; cites methods and background

Full deep neural network training on a pruned weight budget

  • Highly influenced; cites methods, results, and background

CNNs are Globally Optimal Given Multi-Layer Support

  • Highly influenced; cites methods

A None-Sparse Inference Accelerator that Distills and Reuses the Computation Redundancy in CNNs

  • 2019 56th ACM/IEEE Design Automation Conference (DAC)
  • 2019
  • Cites background and methods
