Publications
Deep Learning with Limited Numerical Precision
TLDR
We study the effect of limited-precision data representation and computation on neural network training, with a special focus on the rounding mode adopted while performing operations on fixed-point numbers. (A minimal rounding sketch follows this entry.)
  • Citations: 1,124 · Highly influential citations: 68 · PDF available
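The rounding mode studied in this paper can be illustrated with a short sketch of stochastic rounding to a fixed-point grid. This is a minimal NumPy illustration, not the authors' implementation; the function name and the word-length and fractional-length parameters are assumptions chosen only for demonstration.

```python
import numpy as np

def stochastic_round_fixed_point(x, frac_bits=8, word_bits=16):
    """Round x to a signed fixed-point grid with `frac_bits` fractional bits,
    rounding up or down at random in proportion to the fractional remainder."""
    scale = 2.0 ** frac_bits
    scaled = np.asarray(x, dtype=np.float64) * scale
    floor = np.floor(scaled)
    # Probability of rounding up equals the distance to the lower grid point.
    prob_up = scaled - floor
    rounded = floor + (np.random.rand(*scaled.shape) < prob_up)
    # Saturate to the representable range of a signed `word_bits`-bit value.
    max_int = 2 ** (word_bits - 1) - 1
    min_int = -2 ** (word_bits - 1)
    return np.clip(rounded, min_int, max_int) / scale

# Example: stochastic rounding is unbiased on average, unlike round-to-nearest.
x = np.full(100_000, 0.12345)
print(stochastic_round_fixed_point(x).mean())  # close to 0.12345
```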
PACT: Parameterized Clipping Activation for Quantized Neural Networks
TLDR
This paper proposes a novel quantization scheme for activations during training that enables neural networks to work well with ultra-low-precision weights and activations without any significant accuracy degradation. (A quantization sketch follows this entry.)
  • Citations: 212 · Highly influential citations: 52 · PDF available
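PACT's central mechanism is to clip activations at a learnable threshold α and then quantize the clipped range uniformly to k bits. The sketch below shows only the forward quantization step under that scheme; the learnable-α training loop and its straight-through gradient are omitted, and the function name and example values are illustrative assumptions.

```python
import numpy as np

def pact_quantize(x, alpha, k=4):
    """Clip activations to [0, alpha], then quantize uniformly to k bits.
    In the paper alpha is a learnable clipping level; here it is a fixed scalar."""
    clipped = np.clip(x, 0.0, alpha)
    levels = 2 ** k - 1
    # Uniform quantization of the clipped range back to real-valued levels.
    return np.round(clipped / alpha * levels) / levels * alpha

# Example: 4-bit quantization of ReLU-like activations with alpha = 6.0.
acts = np.array([-1.0, 0.5, 3.2, 7.5])
print(pact_quantize(acts, alpha=6.0))  # [0.0, 0.4, 3.2, 6.0]
```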
Phase change memory technology
The authors survey the current state of phase change memory (PCM), a nonvolatile solid-state memory technology built around the large electrical contrast between the highly resistive amorphous and highly conductive crystalline phases.
  • Citations: 695 · Highly influential citations: 35 · PDF available
Overview of candidate device technologies for storage-class memory
TLDR
Storage-class memory (SCM) combines the benefits of a solid-state memory, such as high performance and robustness, with the archival capabilities and low cost of conventional hard-disk magnetic storage.
  • Citations: 563 · Highly influential citations: 29 · PDF available
Training Deep Neural Networks with 8-bit Floating Point Numbers
TLDR
The state-of-the-art hardware platforms for training deep neural networks are moving from traditional single-precision (32-bit) computations toward 16 bits of precision, in large part due to the high energy efficiency and smaller bit storage associated with reduced-precision representations.
  • Citations: 122 · Highly influential citations: 15 · PDF available
Nanoscale electronic synapses using phase change devices
TLDR
The memory capacity, computational power, communication bandwidth, energy consumption, and physical size of the brain all tend to scale with the number of synapses, which outnumber neurons by a factor of 10,000.
  • Citations: 116 · Highly influential citations: 9
Activation and diffusion studies of ion-implanted p and n dopants in germanium
We have demonstrated symmetrically high levels of electrical activation of both p- and n-type dopants in germanium. Rapid thermal annealing of various commonly implanted dopant species was performed…
  • Citations: 247 · Highly influential citations: 7
Bridging the Accuracy Gap for 2-bit Quantized Neural Networks (QNN)
TLDR
This paper proposes novel techniques that target weight and activation quantization separately, resulting in an overall quantized neural network that achieves state-of-the-art classification accuracy across a range of popular models and datasets.
  • Citations: 37 · Highly influential citations: 7 · PDF available
AdaComp: Adaptive Residual Gradient Compression for Data-Parallel Distributed Training
TLDR
We demonstrate end-to-end compression rates of ~200X for fully connected and recurrent layers, and ~40X for convolutional layers of a neural network. (A simplified compression sketch follows this entry.)
  • Citations: 56 · Highly influential citations: 5 · PDF available
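The compression rates quoted above come from transmitting only a small, adaptively chosen subset of each layer's gradients and carrying the rest forward as a locally accumulated residual. The sketch below is a simplified residual-sparsification step under that general scheme; AdaComp's exact bin-based selection rule is not reproduced, and the top-k threshold heuristic, function name, and keep ratio used here are assumptions for illustration.

```python
import numpy as np

def compress_gradients(grad, residual, keep_ratio=0.01):
    """Transmit only the largest-magnitude entries of (gradient + residual);
    everything not transmitted is accumulated into the residual for the next step."""
    acc = grad + residual
    k = max(1, int(keep_ratio * acc.size))
    # Keep the k largest-magnitude entries (a simplified selection rule).
    threshold = np.partition(np.abs(acc).ravel(), -k)[-k]
    mask = np.abs(acc) >= threshold
    sent = np.where(mask, acc, 0.0)          # sparse update that is transmitted
    new_residual = np.where(mask, 0.0, acc)  # locally retained remainder
    return sent, new_residual

# Example: compress a gradient vector, keeping roughly 1% of entries per step.
g = np.random.randn(1000)
sent, res = compress_gradients(g, np.zeros_like(g))
print(np.count_nonzero(sent), "entries transmitted out of", g.size)
```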
A Scalable Multi-TeraOPS Deep Learning Processor Core for AI Training and Inference
TLDR
A multi-TOPS AI core is presented for acceleration of deep learning training and inference in systems from edge devices to data centers.
  • Citations: 51 · Highly influential citations: 3