• Corpus ID: 195345065

Adaptive Precision CNN Accelerator Using Radix-X Parallel Connected Memristor Crossbars

Jaeheum Lee, Jason Kamran Eshraghian, Kyoungrok Cho, Kamran Eshraghian

Neural processor development is reducing our reliance on remote server access to process deep learning operations in an increasingly edge-driven world. By employing in-memory processing, parallelization techniques, and algorithm-hardware co-design, memristor crossbar arrays are known to efficiently compute large-scale matrix-vector multiplications. However, state-of-the-art implementations of negative weights require duplicative column wires, and high precision weights using single-bit…
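
As a rough illustration of the conventional scheme the abstract refers to (not the paper's radix-X design), the sketch below models a crossbar matrix-vector multiplication in which each signed weight is stored as a pair of conductances on duplicated columns. The conductance limits `G_MIN` and `G_MAX`, the linear weight mapping, and both function names are assumptions made for this example only.

```python
# Illustrative sketch only: conductance limits and the linear mapping are
# assumed values, not taken from the paper.
G_MIN = 1e-6  # assumed off-state conductance in siemens
G_MAX = 1e-4  # assumed on-state conductance in siemens


def weights_to_conductance_pairs(weights, w_max):
    """Map each signed weight to a (G_plus, G_minus) conductance pair.

    `weights[i][j]` connects input row i to output column j.
    Returns the pairs and the siemens-per-unit-weight scale factor.
    """
    scale = (G_MAX - G_MIN) / w_max
    pairs = []
    for row in weights:
        pair_row = []
        for w in row:
            g_plus = G_MIN + scale * max(w, 0.0)    # carries positive weights
            g_minus = G_MIN + scale * max(-w, 0.0)  # carries negative weights
            pair_row.append((g_plus, g_minus))
        pairs.append(pair_row)
    return pairs, scale


def crossbar_mvm(pairs, voltages):
    """Return one differential current per logical output column.

    Each device contributes I = V * G (Ohm's law), each column wire sums its
    currents (Kirchhoff's current law), and the duplicated "minus" column is
    subtracted from the "plus" column to recover the signed result.
    """
    n_rows = len(voltages)
    n_cols = len(pairs[0])
    currents = []
    for j in range(n_cols):
        i_plus = sum(voltages[i] * pairs[i][j][0] for i in range(n_rows))
        i_minus = sum(voltages[i] * pairs[i][j][1] for i in range(n_rows))
        currents.append(i_plus - i_minus)
    return currents
```

Dividing each output current by the returned `scale` recovers the ordinary signed dot product, because the `G_MIN` offsets cancel in the subtraction. The cost of this scheme is the one named in the abstract: every logical column needs two physical column wires.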

A Memristor-CMOS Braun Multiplier Array for Arithmetic Pipelining

A memristor-CMOS hybrid multiplier uses a Braun structure to segment the multiplier array, so that several multiplication operations with reconfigurable bit-widths can be processed simultaneously on the same multiplier by varying a control signal, which increases computational throughput.

Low Power In-Memory Implementation of Ternary Neural Networks with Resistive RAM-Based Synapse

Neural network simulation on the CIFAR-10 image recognition task shows that going from binary to ternary neural networks significantly increases neural network performance, highlighting that the function of AI circuits may sometimes be revisited when they are operated in low-power regimes.

Implementation of Ternary Weights With Resistive RAM Using a Single Sense Operation Per Synapse

Neural network simulation on the CIFAR-10 image recognition task shows that ternary neural networks significantly increase neural network performance compared with binary ones, which are often preferred for inference hardware.

Single Crossbar Array of Memristors With Bipolar Inputs for Neuromorphic Image Recognition

The proposed crossbar architecture of memristors with bipolar inputs for an image recognition application shows a recognition rate improved by 5%, 7%, and 4% over that of the memristor binarized neural network, the complementary architecture of the memristor crossbar, and the twin architecture when recognizing 10 images.

An 8-bit Radix-4 Non-Volatile Parallel Multiplier

An 8-bit radix-4 non-volatile parallel multiplier with improved computational capabilities is proposed, and the corresponding Booth encoding scheme, read-out circuit, simplified Wallace tree, and Manchester carry chain are presented, which help to shorten the delay of the proposed multiplier.
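
For readers unfamiliar with the encoding mentioned above, the following is a minimal software sketch of textbook radix-4 Booth recoding for an 8-bit signed multiplier. It is not the circuit from the paper; the function names `booth_radix4_digits` and `booth_multiply` are hypothetical and chosen only for this example.

```python
def booth_radix4_digits(y, bits=8):
    """Recode a signed integer into radix-4 Booth digits, least significant first.

    Each digit lies in {-2, -1, 0, 1, 2} and is computed from three
    overlapping bits as  y[2k] + y[2k-1] - 2*y[2k+1],  with y[-1] = 0.
    """
    if bits % 2 != 0:
        raise ValueError("bits must be even for radix-4 recoding")
    if not -(1 << (bits - 1)) <= y < (1 << (bits - 1)):
        raise ValueError("y does not fit in the given signed width")
    pattern = y & ((1 << bits) - 1)  # two's complement bit pattern of y
    digits = []
    prev = 0  # implicit bit to the right of the least significant bit
    for i in range(0, bits, 2):
        low = (pattern >> i) & 1
        high = (pattern >> (i + 1)) & 1
        digits.append(low + prev - 2 * high)
        prev = high
    return digits


def booth_multiply(x, y, bits=8):
    """Multiply x by y by summing Booth-recoded partial products."""
    return sum(d * x * 4 ** k for k, d in enumerate(booth_radix4_digits(y, bits)))
```

For example, `booth_radix4_digits(7)` yields `[-1, 2, 0, 0]`, that is, 7 = -1 + 2·4. The recoding halves the number of partial products relative to a bit-by-bit multiplier, which is the usual motivation for radix-4 designs.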

Reconfigurable multiplier architecture based on memristor-cmos with higher flexibility

This paper introduces a memristor-CMOS-based reconfigurable multiplier that provides flexible multiplication across various bit-widths in digital signal processing systems.

A Memristor-Based Compressive Sampling Encoder with Dynamic Rate Control for Low-Power Video Streaming

A memristor-based CS encoder that can be integrated with conventional image sensors to achieve high performance with low power consumption and hardware overhead is proposed, and a self-adaptive compression rate control mechanism is devised to maximize the performance of the proposed technique.

High-Density Memristor-CMOS Ternary Logic Family

This paper systematically designs, simulates, and experimentally verifies the primitive logic functions (the ternary AND, OR, and NOT gates), obtaining close to an order of magnitude improvement in data density over conventional CMOS logic and a reduction of switching speed by a factor of 13 over prior state-of-the-art ternary memristor results.

Exploiting Memristors for Compressive Sensing Applications

The proposed CS system achieves higher energy efficiency, lower hardware complexity, and very good recovery quality compared with existing implementations of both CS systems and conventional codec methods.



Analog Weights in ReRAM DNN Accelerators

This paper presents a novel scheme for alleviating the single-bit-per-device restriction by exploiting the frequency dependence of v-i plane hysteresis, assigning kernel information not only to the device conductance but also partially distributing it to the frequency of a time-varying input.

Analogue signal and image processing with large memristor crossbars

It is shown that reconfigurable memristor crossbars composed of hafnium oxide memristors on top of metal-oxide-semiconductor transistors are capable of analogue vector-matrix multiplication with array sizes of up to 128 × 64 cells.

NN compactor: Minimizing memory and logic resources for small neural networks

A fully automatic framework called NN Compactor is presented that generates a compact neural accelerator by minimizing the memory requirements of synaptic weights through dual-track quantization and minimizing the logic requirements of PUs with minimum recognition accuracy loss.

Neuromorphic computing using non-volatile memory

The relevant virtues and limitations of these devices are assessed, in terms of properties such as conductance dynamic range, (non)linearity and (a)symmetry of conductance response, retention, endurance, required switching power, and device variability.

A memristor-based neuromorphic engine with a current sensing scheme for artificial neural network applications

This work proposes a new memristor crossbar-based computing engine design that leverages a current sensing scheme and demonstrates good computation accuracy, e.g., 96.6% classification accuracy for MNIST handwritten digits in a two-layer design.

Binary convolutional neural network on RRAM

An RRAM crossbar-based accelerator is proposed for the BCNN forward process and shows much smaller accuracy loss than multi-bit CNNs for LeNet on MNIST when considering device variation.

Memristor Crossbar-Based Neuromorphic Computing System: A Case Study

The results show that the hardware-based training scheme proposed in the paper, applied to brain-state-in-a-box (BSB) neural networks, can alleviate and even cancel out the majority of the noise issues.

Design and Analysis of a Hardware CNN Accelerator

A systolic array-based architecture called ConvAU is designed and implemented to efficiently accelerate dense matrix multiplication operations in CNNs, and ConvAU is found to give a 200x improvement in TOPs/W when compared to an NVIDIA K80 GPU and a 1.9x improvement when compared to the TPU.

Binary Weighted Memristive Analog Deep Neural Network for Near-Sensor Edge Processing

It is shown that an analog deep neural network with binary weight updates through the backpropagation algorithm, using binary-state memristive devices, can be successfully used for image processing tasks and has the advantage of lower power consumption and smaller on-chip area in comparison with digital counterparts.

Efficient and self-adaptive in-situ learning in multilayer memristor neural networks

This work monolithically integrates hafnium oxide-based memristors with a foundry-made transistor array into a multiple-layer memristor neural network and achieves competitive classification accuracy on a standard machine learning dataset.