Always-On, Sub-300-nW, Event-Driven Spiking Neural Network based on Spike-Driven Clock-Generation and Clock- and Power-Gating for an Ultra-Low-Power Intelligent Device

Dewei Wang, Pavan Kumar Chundi, Sung Justin Kim, Minhao Yang, João Pedro Cerqueira, Joonsung Kang, Seungchul Jung, Sangjoon Kim, Mingoo Seok
2020 IEEE Asian Solid-State Circuits Conference (A-SSCC)

Always-on artificial intelligence (AI) functions such as keyword spotting (KWS) and visual wake-up tend to dominate total power consumption in ultra-low-power devices [1]. A key observation is that the input signals to an always-on function are sparse in time, which a spiking neural network (SNN) classifier can leverage for power savings, because the switching activity and power consumption of SNNs tend to scale with spike rate. Toward this goal, we present a novel SNN classifier architecture for…
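
The scaling argument in the abstract can be illustrated with a toy model. The sketch below is not the authors' architecture: it only assumes that an event-driven layer performs one weight accumulation per input spike, and that average power is a static floor plus the energy of those accumulations. The function names and every numeric value (`static_nw`, `energy_per_op_pj`, the 1 ms step) are hypothetical placeholders, not figures from the paper.

```python
import random


def count_synaptic_ops(spike_rate, n_inputs=16, n_steps=1000, seed=0):
    """Count weight accumulations in a toy event-driven layer.

    Work is done only when an input spikes, so the count tracks the
    spike rate. All parameters are illustrative assumptions.
    """
    rng = random.Random(seed)
    ops = 0
    for _ in range(n_steps):
        for _ in range(n_inputs):
            # Each input spikes independently with probability spike_rate.
            if rng.random() < spike_rate:
                ops += 1
    return ops


def average_power_nw(ops, duration_s, energy_per_op_pj, static_nw):
    """Static floor plus dynamic power from the counted operations."""
    # pJ per second is pW; divide by 1000 to express it in nW.
    dynamic_nw = ops * energy_per_op_pj / duration_s / 1000.0
    return static_nw + dynamic_nw


if __name__ == "__main__":
    duration_s = 1000 * 1e-3  # 1000 steps of 1 ms each (assumed)
    for rate in (0.0, 0.01, 0.1):
        ops = count_synaptic_ops(rate)
        power = average_power_nw(ops, duration_s,
                                 energy_per_op_pj=1.0, static_nw=50.0)
        print(f"spike rate {rate:.2f}: {ops} ops, {power:.2f} nW")
```

In this model, a silent input costs only the static term, which is why reducing static power (for example through gating) matters most when spikes are rare.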

Always-On Sub-Microwatt Spiking Neural Network Based on Spike-Driven Clock- and Power-Gating for an Ultra-Low-Power Intelligent Device

A novel spiking neural network (SNN) classifier architecture enables always-on artificial intelligence functions, such as keyword spotting and visual wake-up, in ultra-low-power internet-of-things (IoT) devices by employing an event-driven architecture to obtain very low static power dissipation.

IMPULSE: A 65-nm Digital Compute-in-Memory Macro With Fused Weights and Membrane Potential for Spike-Based Sequential Learning Tasks

A 10T-SRAM compute-in-memory (CIM) macro, specifically designed for state-of-the-art SNN inference, with staggered data mapping and reconfigurable peripherals for handling the different bit-precision requirements of SNN functionalities.

Spiking Neural Network Integrated Circuits: A Review of Trends and Future Directions

The rapid growth of deep learning, spurred by its successes in various fields ranging from face recognition [1] to game playing [2], has also triggered a growing interest in the design of specialized

A Background-Noise and Process-Variation-Tolerant 109nW Acoustic Feature Extractor Based on Spike-Domain Divisive-Energy Normalization for an Always-On Keyword Spotting Device

A normalized acoustic feature extractor (NAFE) chip in 65nm takes an acoustic signal from a microphone and produces spike-rate coded features; paired with a spiking neural network (SNN) classifier chip, it forms an end-to-end KWS system.

Hardware Accelerator and Neural Network Co-Optimization for Ultra-Low-Power Audio Processing Devices

HANNAH (Hardware Accelerator and Neural Network seArcH), a framework for automated and combined hardware/software co-design of deep neural networks and hardware accelerators for resource- and power-constrained edge devices, is presented.

Hardware Aware Training for Efficient Keyword Spotting on General Purpose and Specialized Hardware

This work uses hardware aware training (HAT) to build new KWS neural networks based on the Legendre Memory Unit (LMU) that achieve state-of-the-art (SotA) accuracy and low parameter counts, and characterizes the power requirements of custom-designed accelerator hardware.

A 23 μW Keyword Spotting IC with Ring-Oscillator-Based Time-Domain Feature Extraction

The first keyword spotting IC which uses a ring-oscillator-based time-domain processing technique for its analog feature extractor (FEx) and offers a better technology scalability compared to conventional voltage-domain designs is presented.

Hardware Acceleration for Embedded Keyword Spotting: Tutorial and Survey

This article extensively surveys the approaches taken by recent state-of-the-art (SotA) work at the algorithmic, architectural, and circuit levels to enable KWS tasks in edge devices, and guides the reader through the design of KWS systems.



7.6 A 65nm 236.5nJ/Classification Neuromorphic Processor with 7.5% Energy Overhead On-Chip Learning Using Direct Spike-Only Feedback

On-chip training of inference engines for neural network and machine learning algorithms shows that on-chip training could serve practical purposes such as compensating for process variations of in-memory computing or adapting to changing environments in real time.

A 4096-Neuron 1M-Synapse 3.8pJ/SOP Spiking Neural Network with On-Chip STDP Learning and Sparse Weights in 10nm FinFET CMOS

A 4096-neuron, 1M-synapse SNN in 10nm FinFET CMOS achieves a peak throughput of 25.2GSOP/s at 0.9V, peak energy efficiency of 3.8pJ/SOP at 525mV, and 2.3μW/neuron operation at 450mV. The

14.1 A 510nW 0.41V Low-Memory Low-Computation Keyword-Spotting Chip Using Serial FFT-Based MFCC and Binarized Depthwise Separable Convolutional Neural Network in 28nm CMOS

Ultra-low power is a strong requirement for always-on speech interfaces in wearable and mobile devices, such as Voice Activity Detection (VAD) and Keyword Spotting (KWS), and high compute and memory requirements have prevented always-on KWS chips from operating in the sub-μW range.

A million spiking-neuron integrated circuit with a scalable communication network and interface

Inspired by the brain’s structure, an efficient, scalable, and flexible non–von Neumann architecture is developed that leverages contemporary silicon technology and is well suited to many applications that use complex neural networks in real time, for example, multiobject detection and classification.

Laika: A 5uW Programmable LSTM Accelerator for Always-on Keyword Spotting in 65nm CMOS

The implementation of a KWS system using an LSTM accelerator designed in 65nm CMOS is presented, showing a power consumption of less than 5μW for real-time KWS applications; approximate computing techniques further reduce power consumption while maintaining high accuracy and reliability.

Spiking Deep Convolutional Neural Networks for Energy-Efficient Object Recognition

A novel approach for converting a deep CNN into an SNN that enables mapping CNNs to spike-based hardware architectures; the resulting SNN is evaluated on the publicly available Defense Advanced Research Projects Agency (DARPA) Neovision2 Tower and CIFAR-10 datasets and shows object recognition accuracy similar to that of the original CNN.

Design of an Always-On Deep Neural Network-Based 1-μW Voice Activity Detector Aided With a Customized Software Model for Analog Feature Extraction

This paper presents an ultra-low-power voice activity detector (VAD). It uses analog signal processing for acoustic feature extraction (AFE) directly on the microphone output, approximate

A 5.1pJ/Neuron 127.3us/Inference RNN-based Speech Recognition Processor using 16 Computing-in-Memory SRAM Macros in 65nm CMOS

A 65nm CMOS speech recognition processor, named Thinker-IM, which employs 16 computing-in-memory (SRAM-CIM) macros for binarized recurrent neural network (RNN) computation, achieving neural energy efficiency 2.8× better than the state of the art.

Temporarily Fine-Grained Sleep Technique for Near- and Subthreshold Parallel Architectures

A power-gating switch (PGS) design technique, inspired by the so-called zigzag supercutoff CMOS, is proposed in order to optimize the overheads of mode transitions of PGS in near- and subthreshold voltage circuits.

Robust and energy-efficient asynchronous dynamic pipelines for ultra-low-voltage operation using adaptive keeper control

The proposed method, demonstrated in two widely-used pipelines, directly addresses the asynchronous contention issue by dynamic monitoring of neighboring traffic at each pipeline stage, to eliminate write contention issues.