Publications
DianNao: a small-footprint high-throughput accelerator for ubiquitous machine-learning
TLDR: We show that it is possible to design an accelerator with high throughput, capable of performing 452 GOP/s (key NN operations such as synaptic weight multiplications and neuron output additions) in a small footprint of 3.02 mm² at 485 mW.
ShiDianNao: Shifting vision processing closer to the sensor
TLDR: In recent years, neural network accelerators have been shown to achieve both high energy efficiency and high performance for a broad application scope within the important category of recognition and mining applications.
Cambricon-X: An accelerator for sparse neural networks
TLDR: We propose a novel accelerator, Cambricon-X, to exploit the sparsity and irregularity of NN models for increased efficiency.
Cambricon: An Instruction Set Architecture for Neural Networks
TLDR: In this paper, we propose a novel domain-specific Instruction Set Architecture (ISA) for neural networks called Cambricon, which allows NN accelerators to flexibly support a broad range of different NN techniques.
Cambricon-S: Addressing Irregularity in Sparse Neural Networks through A Cooperative Software/Hardware Approach
TLDR: We propose a software-based coarse-grained pruning technique to drastically reduce the irregularity of sparse synapses.
Leveraging the error resilience of machine-learning applications for designing highly energy efficient accelerators
TLDR: In recent years, inexact computing has been increasingly regarded as one of the most promising approaches for reducing energy consumption in many applications that can tolerate a degree of inaccuracy.
Neuromorphic accelerators: A comparison between neuroscience and machine-learning approaches
TLDR: A vast array of devices, ranging from industrial robots to self-driven cars or smartphones, require increasingly sophisticated processing of real-world input data (image, voice, radio, …).
A High-Throughput Neural Network Accelerator
TLDR: An accelerator architecture for large-scale neural networks that can perform 496 16-bit fixed-point operations in parallel every 1.02 ns, that is, 452 GOP/s, in a 3.02 mm², 485 mW footprint.
TDSNN: From Deep Neural Networks to Deep Spike Neural Networks with Temporal-Coding
TLDR: We propose a novel method to convert DNNs to temporal-coding SNNs, called TDSNN, which achieves a 42% reduction in total operations on average in large networks.
Rubik: A Hierarchical Architecture for Efficient Graph Learning
  • X. Chen, Yuke Wang, +9 authors Yuan Xie
  • Computer Science
  • ArXiv
  • 26 September 2020
TLDR: Graph convolutional networks (GCNs) are emerging as a promising direction for learning inductive representations of graph data, which is widely used in applications such as e-commerce, social networks, and knowledge graphs.