A 64-mW DNN-Based Visual Navigation Engine for Autonomous Nano-Drones
TLDR
This paper presents the first (to the best of our knowledge) demonstration of a navigation engine for autonomous nano-drones capable of closed-loop end-to-end DNN-based visual navigation, and develops a complete methodology for parallel execution of complex DNNs directly on board resource-constrained milliwatt-scale nodes.
A ultra-low-energy convolution engine for fast brain-inspired vision in multicore clusters
TLDR
This work proposes to augment many-core architectures using shared-memory clusters of power-optimized RISC processors with Hardware Convolution Engines (HWCEs): ultra-low energy coprocessors for accelerating convolutions, the main building block of many brain-inspired computer vision algorithms.
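The "main building block" the HWCE accelerates is the 2-D convolution at the heart of CNN-style vision. As a point of reference only (not the paper's implementation), a minimal sketch of the operation in question:

```python
def conv2d(img, k):
    """Naive 2-D 'valid' convolution as used in CNNs (i.e., cross-correlation:
    the kernel is not flipped). Pure-Python reference, not an HWCE model."""
    H, W = len(img), len(img[0])
    kh, kw = len(k), len(k[0])
    return [[sum(img[i + u][j + v] * k[u][v]
                 for u in range(kh) for v in range(kw))
             for j in range(W - kw + 1)]
            for i in range(H - kh + 1)]

img = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]
k = [[1, 0],
     [0, 1]]
print(conv2d(img, k))  # [[6, 8], [12, 14]]
```

A hardware engine like the HWCE wins by streaming these multiply-accumulates through dedicated datapaths instead of executing them one instruction at a time.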
GAP-8: A RISC-V SoC for AI at the Edge of the IoT
TLDR
GAP-8 is proposed: a multi-GOPS fully programmable RISC-V IoT-edge computing engine, featuring an 8-core cluster with CNN accelerator, coupled with an ultra-low power MCU with 30 μW state-retentive sleep power.
Slow and steady wins the race? A comparison of ultra-low-power RISC-V cores for Internet-of-Things applications
TLDR
This paper introduces Zero-riscy and Micro-riscy, two novel RISC-V cores targeting mixed arithmetic/control applications and control-oriented tasks respectively, and compares them with the DSP-enhanced open-source RISCY core.
PULP: A Ultra-Low Power Parallel Accelerator for Energy-Efficient and Flexible Embedded Vision
TLDR
This work proposes PULP (Parallel processing Ultra-Low Power platform), an architecture built on clusters of tightly-coupled OpenRISC ISA cores, with advanced techniques for fast performance and energy scalability that exploit the capabilities of the STMicroelectronics UTBB FD-SOI 28 nm technology.
NEURAghe: Exploiting CPU-FPGA Synergies for Efficient and Flexible CNN Inference Acceleration on Zynq SoCs
TLDR
NEURAghe is presented, a flexible and efficient hardware/software solution for the acceleration of CNNs on Zynq SoCs that leverages the synergistic usage of Zynq ARM cores and of a powerful and flexible Convolution-Specific Processor deployed on the reconfigurable logic.
An IoT Endpoint System-on-Chip for Secure and Energy-Efficient Near-Sensor Analytics
TLDR
This work proposes Fulmine, a system-on-chip (SoC) based on a tightly-coupled multi-core cluster augmented with specialized blocks for compute-intensive data processing and encryption functions, while supporting software programmability for regular computing tasks.
XNOR Neural Engine: A Hardware Accelerator IP for 21.6-fJ/op Binary Neural Network Inference
TLDR
The XNOR neural engine (XNE), a fully digital configurable hardware accelerator IP for BNNs, integrated within a microcontroller unit (MCU) equipped with an autonomous I/O subsystem and hybrid SRAM/standard cell memory, is introduced.
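In a binary neural network, the multiply-accumulate that dominates inference collapses to an XNOR followed by a population count, which is what makes fJ/op-scale hardware like the XNE possible. A hedged sketch of that core identity (names and encoding are illustrative, not from the paper):

```python
def xnor_popcount_dot(a: int, b: int, n_bits: int) -> int:
    """Dot product of two {-1, +1} vectors packed as n_bits-wide bitmasks,
    with bit = 1 encoding +1 and bit = 0 encoding -1."""
    mask = (1 << n_bits) - 1
    agree = ~(a ^ b) & mask            # XNOR: 1 where bits match (product = +1)
    matches = bin(agree).count("1")    # population count
    # (+1) * matches + (-1) * (n_bits - matches):
    return 2 * matches - n_bits

# [+1, -1, +1, -1] . [+1, +1, -1, -1]  ->  1 - 1 - 1 + 1 = 0
print(xnor_popcount_dot(0b1010, 0b1100, 4))  # -> 0
```

One XNOR-popcount over a machine word thus replaces dozens of multiplications, which is the arithmetic basis for the 21.6 fJ/op figure in the title.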
PULP-NN: accelerating quantized neural networks on parallel ultra-low-power RISC-V processors
TLDR
The key innovation in PULP-NN is a set of kernels for quantized neural network inference, targeting byte and sub-byte data types, down to INT-1, tuned for the recent trend toward aggressive quantization in deep neural network inference.
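Sub-byte kernels of this kind work on operands packed several-per-byte, unpacking them on the fly inside the multiply-accumulate loop. A minimal sketch of the idea for unsigned INT-4 data (illustrative only; PULP-NN's actual kernels use RISC-V SIMD intrinsics and a different data layout):

```python
def pack_int4(vals):
    """Pack pairs of unsigned 4-bit values into bytes, low nibble first."""
    assert len(vals) % 2 == 0 and all(0 <= v < 16 for v in vals)
    return bytes(vals[i] | (vals[i + 1] << 4) for i in range(0, len(vals), 2))

def dot_int4(pa: bytes, pb: bytes) -> int:
    """Dot product over INT-4 operands packed two-per-byte:
    unpack both nibbles of each byte and multiply-accumulate."""
    acc = 0
    for ba, bb in zip(pa, pb):
        acc += (ba & 0xF) * (bb & 0xF)   # low nibbles
        acc += (ba >> 4) * (bb >> 4)     # high nibbles
    return acc

a = pack_int4([1, 2, 3, 4])
b = pack_int4([5, 6, 7, 8])
print(dot_int4(a, b))  # 1*5 + 2*6 + 3*7 + 4*8 = 70
```

Packing halves (or better) the memory traffic per operand, which is exactly what aggressive quantization buys on memory-bound ultra-low-power cores.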
Chipmunk: A systolically scalable 0.9 mm², 3.08 Gop/s/mW @ 1.2 mW accelerator for near-sensor recurrent neural network inference
TLDR
CHIPMUNK, a small (<1 mm²) hardware accelerator for long short-term memory (LSTM) RNNs in UMC 65 nm technology, capable of operating at a measured peak efficiency of up to 3.08 Gop/s/mW at 1.24 mW peak power, can achieve real-time phoneme extraction on a demanding RNN topology proposed in [1], consuming less than 13 mW of average power.