Machine Learning for Microcontroller-Class Hardware: A Review

Swapnil Sayan Saha, Sandeep Singh Sandha, and Mani B. Srivastava. IEEE Sensors Journal.
Advances in machine learning (ML) have opened new opportunities to bring intelligence to low-end Internet-of-Things (IoT) nodes such as microcontrollers. Conventional ML deployments have high memory and compute footprints that hinder direct deployment on ultra-resource-constrained microcontrollers. This article highlights the unique requirements of enabling onboard ML for microcontroller-class devices. Researchers use a specialized model development workflow for resource-limited…

SENSIPLUS-LM: A Low-Cost EIS-Enabled Microchip Enhanced with an Open-Source Tiny Machine Learning Toolchain

This paper responds to the need for an integrated architecture able to host both the sensing part and ML-powered learning and classification mechanisms directly on board, thereby overcoming some of the limitations of off-the-shelf solutions.

Intelligence at the Extreme Edge: A Survey on Reformable TinyML

This work surveys reformable TinyML solutions, proposes a novel taxonomy, explores the suitability of each hierarchical layer for reformability, and discusses how reformable TinyML can impact a few selected industrial areas.

Specially-Designed Out-of-Order Processor Architecture for Microcontrollers

An open-source RISC-V design is taken as the prototype for a hardware microarchitecture that quadruples the number of in-flight pipelined instructions, greatly alleviating stalling of the instruction stream while limiting the extra look-up table utilization in the FPGA implementation.

E-prop on SpiNNaker 2: Exploring online learning in spiking RNNs on neuromorphic hardware

The biologically inspired e-prop approach for training spiking recurrent neural networks (SRNNs) is implemented on a prototype of the SpiNNaker 2 neuromorphic system and is significantly more memory-efficient than other spiking neural network implementations.

Comparison and Evaluation of Machine Learning-Based Classification of Hand Gestures Captured by Inertial Sensors

A comparison of eight machine learning (ML) classifiers on the task of human hand gesture recognition and classification leads to the conclusion that LR is the most suitable of the tested classifiers for online applications in resource-constrained environments, owing to its lower computational complexity compared with the other tested algorithms.

Optimization of the 24-Bit Fixed-Point Format for the Laplacian Source

The 24-bit fixed-point format is optimized by tuning its key parameter and by proposing two adaptation procedures, with the aim of matching the performance of the optimal uniform quantizer over a wide range of input-data variance.
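
To make the idea concrete, the sketch below quantizes Laplacian-distributed samples onto a plain two's-complement fixed-point grid and measures the signal-to-quantization-noise ratio. It is a minimal illustration of fixed-point quantization in general, not the paper's optimized quantizer or its adaptation procedures; the word and fractional bit widths are illustrative.

```python
import math
import random

def quantize_fixed(x, total_bits=24, frac_bits=16):
    """Round x onto a two's-complement fixed-point grid with the given
    number of fractional bits, saturating at the format's range."""
    step = 2.0 ** -frac_bits
    max_code = 2 ** (total_bits - 1) - 1
    min_code = -2 ** (total_bits - 1)
    code = round(x / step)
    code = max(min_code, min(max_code, code))  # saturate on overflow
    return code * step

# Quantize zero-mean, unit-scale Laplacian samples and report SQNR.
random.seed(0)
samples = [random.expovariate(1.0) * random.choice([-1.0, 1.0])
           for _ in range(10000)]
noise = sum((s - quantize_fixed(s)) ** 2 for s in samples)
signal = sum(s * s for s in samples)
sqnr_db = 10 * math.log10(signal / noise)
```

The trade-off the paper optimizes is visible here: more fractional bits shrink the rounding error but also shrink the representable range, so heavy-tailed Laplacian inputs start to saturate.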

High-availability displacement sensing with multi-channel self mixing interferometry

Laser self-mixing is in principle a simple and robust general-purpose interferometric method, with additional expressivity resulting from its nonlinearity. However, it is rather sensitive to…

μNAS: Constrained Neural Architecture Search for Microcontrollers

This work builds a neural architecture search (NAS) system, called μNAS, to automate the design of such small-yet-powerful MCU-level networks, and shows that μNAS represents a significant advance in resource-efficient models.

Visual Wake Words Dataset

A new dataset, Visual Wake Words, is presented that represents a common microcontroller vision use-case of identifying whether a person is present in the image or not, and provides a realistic benchmark for tiny vision models.

Compiling KB-sized machine learning models to tiny IoT devices

SeeDot, a domain-specific language for expressing ML inference algorithms, is presented together with a compiler that translates SeeDot programs into fixed-point code that runs efficiently on constrained IoT devices. A novel compilation strategy reduces the search space for key parameters used in the fixed-point code.
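
The core parameter such a compiler must choose is the scale (number of fractional bits) of each fixed-point value: too few bits lose precision, too many overflow the integer word. A simplified sketch of that choice, followed by an integer-only dot product of the kind the generated code would run, is shown below; the word width and the one-bit headroom rule are illustrative assumptions, not SeeDot's actual strategy.

```python
def best_frac_bits(values, word_bits=16):
    """Choose the largest number of fractional bits such that every value
    still fits in a signed word of word_bits (simplified scale search)."""
    max_mag = max(abs(v) for v in values)
    frac = word_bits - 2  # leave room for the sign bit plus headroom
    while frac > 0 and round(max_mag * 2 ** frac) >= 2 ** (word_bits - 1):
        frac -= 1
    return frac

def fixed_dot(w, x, frac):
    """Integer dot product; the accumulator carries 2*frac fractional
    bits, which are shifted back out at the end."""
    wi = [round(v * 2 ** frac) for v in w]
    xi = [round(v * 2 ** frac) for v in x]
    acc = sum(a * b for a, b in zip(wi, xi))
    return acc / 2 ** (2 * frac)

w = [0.75, -1.5, 0.25]
x = [1.0, 0.5, -2.0]
frac = best_frac_bits(w + x)
approx = fixed_dot(w, x, frac)
```

On an MCU without a floating-point unit, the inner loop above becomes pure integer multiply-accumulate, which is the whole point of compiling to fixed point.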

ML-MCU: A Framework to Train ML Classifiers on MCU-Based IoT Edge Devices

ML-MCU, a framework with the novel Optimized Stochastic Gradient Descent (Opt-SGD) and Opt-OVO algorithms, enables both binary and multiclass ML classifier training directly on MCUs, letting billions of MCU-based IoT edge devices self-learn/train offline after deployment using live data from a wide range of IoT use cases.
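
The access pattern that makes on-MCU training feasible is one-sample-at-a-time SGD, so the dataset never needs to sit in RAM. The sketch below trains a plain logistic-regression classifier that way; it is a generic SGD illustration, not the paper's Opt-SGD or Opt-OVO algorithms, and the toy data and learning rate are arbitrary.

```python
import math
import random

def sgd_train(stream, n_features, lr=0.1):
    """Incrementally fit a logistic-regression classifier from a stream
    of (features, label) pairs, one sample at a time."""
    w = [0.0] * n_features
    b = 0.0
    for x, y in stream:
        z = b + sum(wi * xi for wi, xi in zip(w, x))
        p = 1.0 / (1.0 + math.exp(-z))
        g = p - y  # gradient of the log-loss with respect to z
        for i in range(n_features):
            w[i] -= lr * g * x[i]
        b -= lr * g
    return w, b

def predict(w, b, x):
    return 1 if b + sum(wi * xi for wi, xi in zip(w, x)) > 0 else 0

# Toy linearly separable stream: label is 1 when x0 > x1.
random.seed(1)
points = [[random.random(), random.random()] for _ in range(500)]
data = [(x, 1 if x[0] > x[1] else 0) for x in points]
w, b = sgd_train(data * 4, 2)  # four passes over the stream
acc = sum(predict(w, b, x) == y for x, y in data) / len(data)
```

Memory cost is just the weight vector plus one sample, which is what lets this style of training fit on kilobyte-scale devices.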

Resource-efficient Machine Learning in 2 KB RAM for the Internet of Things

Bonsai can make predictions in milliseconds even on slow microcontrollers, fits in a few KB of memory, has lower battery consumption than all other tested algorithms, and achieves prediction accuracies that can be as much as 30% higher than state-of-the-art methods for resource-efficient machine learning.

Quantization and Deployment of Deep Neural Networks on Microcontrollers

A new framework for end-to-end deep neural network training, quantization, and deployment is presented. Designed as an alternative to existing inference engines (TensorFlow Lite for Microcontrollers and STM32Cube), it can be easily adjusted and/or extended for specific use cases.
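
The quantization step in such toolchains commonly maps float weights to 8-bit integers through an affine scale and zero-point. The sketch below shows that per-tensor scheme in its simplest form; it is a generic illustration of post-training affine quantization, not this framework's specific implementation, and the sample weights are made up.

```python
def quantize_int8(weights):
    """Affine per-tensor quantization: map floats to int8 with a scale
    and zero-point chosen so the float range (extended to include 0.0)
    spans the 256 available codes."""
    lo, hi = min(weights), max(weights)
    lo, hi = min(lo, 0.0), max(hi, 0.0)  # range must contain zero
    scale = (hi - lo) / 255.0 or 1.0
    zero_point = round(-lo / scale) - 128
    q = [max(-128, min(127, round(w / scale) + zero_point))
         for w in weights]
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Recover approximate float values from the int8 codes."""
    return [(v - zero_point) * scale for v in q]

w = [-0.9, -0.2, 0.0, 0.4, 1.1]
q, s, z = quantize_int8(w)
w_hat = dequantize(q, s, z)
```

Forcing the range to include zero guarantees that 0.0 is represented exactly, which matters for zero-padding and ReLU activations.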

Train++: An Incremental ML Model Training Algorithm to Create Self-Learning IoT Devices

B. Sudharsan, P. Yadav, J. Breslin, and M. Ali. 2021 IEEE SmartWorld, Ubiquitous Intelligence & Computing, Advanced & Trusted Computing, Scalable Computing & Communications, Internet of People and Smart City Innovation (SmartWorld/SCALCOM/UIC/ATC/IOP/SCI), 2021.
Train++ transforms even the most resource-constrained MCU-based IoT edge devices into intelligent devices that can locally build their own knowledge base on-the-fly using the live data, thus creating smart self-learning and autonomous problem-solving devices.

SpArSe: Sparse Architecture Search for CNNs on Resource-Constrained Microcontrollers

It is demonstrated that CNNs can be automatically designed that generalize well while being small enough to fit onto memory-limited MCUs; the discovered CNNs are more accurate and up to 4.35× smaller than previous approaches, while meeting the strict MCU working-memory constraint.
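
The defining feature of MCU-aware architecture search (in SpArSe and μNAS alike) is a hard feasibility check: a candidate network is rejected before training if its weights exceed flash or its activation buffers exceed RAM. The sketch below illustrates such a check for a chain of stride-2 convolutions; the layer model, byte counts (int8 everywhere), and limits are hypothetical simplifications, not either paper's actual cost model.

```python
def conv_params(in_ch, out_ch, k):
    """Parameter count of a k*k convolution: weights plus biases."""
    return in_ch * out_ch * k * k + out_ch

def fits_mcu(channels, kernel=3, input_hw=32,
             flash_limit=256_000, ram_limit=64_000):
    """Reject a candidate CNN (a list of channel counts) if its int8
    weights exceed flash, or if the largest pair of adjacent activation
    buffers (input + output of one layer) exceeds RAM."""
    pairs = list(zip(channels, channels[1:]))
    flash = sum(conv_params(ci, co, kernel) for ci, co in pairs)
    # Stride-2 convolutions halve the feature map at every layer.
    hw = [max(1, input_hw // 2 ** i) for i in range(len(channels))]
    ram = max(h * h * ci + (h // 2 or 1) ** 2 * co
              for h, (ci, co) in zip(hw, pairs))
    return flash <= flash_limit and ram <= ram_limit

small = [3, 8, 16, 32]    # fits comfortably
big = [3, 64, 128, 256]   # weights alone blow the flash budget
```

Because the check is cheap, the search can discard infeasible candidates instantly and spend its training budget only on networks that could actually be deployed.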

MCUNet: Tiny Deep Learning on IoT Devices

MCUNet, a framework that jointly designs an efficient neural architecture (TinyNAS) and a lightweight inference engine (TinyEngine), enabling ImageNet-scale inference on microcontrollers, is proposed, suggesting that the era of always-on tiny machine learning on IoT devices has arrived.

Differentiable Network Pruning for Microcontrollers

This work presents a differentiable structured network pruning method for convolutional neural networks that integrates a model's MCU-specific resource usage and parameter-importance feedback to obtain highly compressed yet accurate classification models.
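
What makes pruning "structured" is that whole channels are removed, so the surviving network is a smaller dense convolution an MCU kernel can execute directly, rather than a sparse tensor needing special support. The sketch below drops the weakest output channels of one layer; it uses simple L1-magnitude ranking as a stand-in for the paper's learned, differentiable importance scores, and the filters and keep ratio are made up.

```python
def prune_channels(filters, keep_ratio=0.5):
    """Rank each output channel's filter by its L1 norm and drop the
    weakest channels whole (structured pruning). Magnitude ranking here
    stands in for learned importance feedback."""
    norms = [sum(abs(w) for w in f) for f in filters]
    keep = max(1, int(len(filters) * keep_ratio))
    order = sorted(range(len(filters)),
                   key=lambda i: norms[i], reverse=True)
    kept = sorted(order[:keep])  # preserve original channel order
    return [filters[i] for i in kept], kept

# Four output channels, each a flattened filter.
filters = [[0.01, -0.02], [1.0, -0.5], [0.3, 0.1], [0.0, 0.05]]
pruned, kept_idx = prune_channels(filters, keep_ratio=0.5)
```

Removing a channel here also shrinks the next layer's input depth, which is where the compounding memory savings on an MCU come from.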