Cain: Automatic Code Generation for Simultaneous Convolutional Kernels on Focal-plane Sensor-processors

Edward Stow, Riku Murai, Sajad Saeedi, Paul H. J. Kelly
Focal-plane Sensor-processors (FPSPs) are a camera technology that enables low-power, high-frame-rate computation, making them suitable for edge computation. Unfortunately, these devices’ limited instruction sets and registers make developing complex algorithms difficult. In this work, we present Cain – a compiler that targets SCAMP-5, a general-purpose FPSP – which generates code from multiple convolutional kernels. As an example, given the convolutional kernels for an MNIST digit recognition…
1 Citation
Systematic Comparison of Path Planning Algorithms using PathBench
The benchmarking ability of PathBench is explored in this paper by comparing algorithms across five different hardware systems and three different map types, including built-in PathBench maps, video game maps, and maps from real world databases.


A Camera That CNNs: Towards Embedded Neural Networks on Pixel Processor Arrays
A convolutional neural network implementation for pixel processor array (PPA) sensors, a first step towards embedding neural network processing capability directly onto the focal plane of a sensor.
A 100,000 fps vision sensor with embedded 535GOPS/W 256×256 SIMD processor array
A vision chip operating with 1.9pJ/OP efficiency has been fabricated in 0.18μm CMOS and exploited to conduct real-time image processing operations at 100,000fps, locating a closed-shape object from amongst clutter.
Focal-Plane Sensor-Processor Chips
This book provides an overview of focal plane chip technology, smart imagers and cellular wave computers, along with numerous examples of current vision chips, 3D sensor-processor arrays and their applications, and their near- and mid-term research trends.
Visual Odometry for Pixel Processor Arrays
This work introduces methods of image scaling, rotation and alignment which are performed solely upon the PPA itself and form the basis for conducting motion estimation, and demonstrates the algorithms on a SCAMP-5 vision chip, achieving frame rates >1000Hz at ~2W power consumption.
Neural Sensors: Learning Pixel Exposures for HDR Imaging and Video Compressive Sensing With Programmable Sensors
This work introduces neural sensors as a methodology to optimize per-pixel shutter functions jointly with a differentiable image processing method, such as a neural network, in an end-to-end fashion and demonstrates how to leverage emerging programmable and re-configurable sensor–processors to implement the optimized exposure functions directly on the sensor.
Locating high speed multiple objects using a SCAMP-5 vision-chip
Presented in this paper is a demonstration system that uses a low-power SCAMP-5 256×256 vision-chip to locate and count multiple objects moving at high speed along arbitrary trajectories.
Gradient-based learning applied to document recognition
This paper reviews various methods applied to handwritten character recognition and compares them on a standard handwritten digit recognition task; convolutional neural networks are shown to outperform all other techniques.
Linnea: Automatic Generation of Efficient Linear Algebra Programs
Linnea is a code generator for linear algebra problems that uses a custom best-first search algorithm to find a first solution in less than a second, and increasingly better solutions when given more time.
Automatic Generation of Efficient Linear Algebra Programs
This work develops Linnea, a code generator for linear algebra problems that takes a high-level description of a linear algebra problem and produces as output an efficient sequence of calls to high-performance kernels.
TensorFlow: A system for large-scale machine learning
The TensorFlow dataflow model is described and the compelling performance that TensorFlow achieves for several real-world applications is demonstrated.