# An FPGA-Based On-Device Reinforcement Learning Approach using Online Sequential Learning

@article{Watanabe2020AnFO, title={An FPGA-Based On-Device Reinforcement Learning Approach using Online Sequential Learning}, author={Hirohisa Watanabe and Mineto Tsukada and Hiroki Matsutani}, journal={2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)}, year={2020}, pages={96-103} }

DQN (Deep Q-Network) is a method to perform Q-learning for reinforcement learning using deep neural networks. DQNs require a large buffer and batch processing for experience replay and rely on backpropagation-based iterative optimization, making them difficult to implement on resource-limited edge devices. In this paper, we propose a lightweight on-device reinforcement learning approach for low-cost FPGA devices. It exploits a recently proposed neural-network based on-device learning…
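To illustrate why online sequential learning can avoid both the replay batch and backpropagation, here is a minimal sketch of a single-sample recursive least-squares update in the style of OS-ELM. It is an assumption that this matches the approach named in the title; the class name `OSELMSingleSample`, the `tanh` activation, the regularization value, and all sizes are hypothetical choices for illustration, not the authors' implementation. With one sample per update, the matrix inverse reduces to a scalar division.

```python
import math
import random


class OSELMSingleSample:
    """Illustrative OS-ELM-style regressor with one scalar output (hypothetical)."""

    def __init__(self, n_in, n_hidden, reg=1.0, seed=0):
        rng = random.Random(seed)
        # Input weights and biases are random and stay fixed (never trained).
        self.w = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
        self.b = [rng.uniform(-1, 1) for _ in range(n_hidden)]
        # P approximates (H^T H + reg * I)^-1; with no data yet it is I / reg.
        self.P = [[(1.0 / reg if i == j else 0.0) for j in range(n_hidden)]
                  for i in range(n_hidden)]
        # Output weights are the only trained parameters.
        self.beta = [0.0] * n_hidden

    def _hidden(self, x):
        return [math.tanh(sum(wi * xi for wi, xi in zip(row, x)) + bj)
                for row, bj in zip(self.w, self.b)]

    def predict(self, x):
        h = self._hidden(x)
        return sum(hi * bi for hi, bi in zip(h, self.beta))

    def update(self, x, target):
        """Fold one (x, target) pair into the model without storing it."""
        h = self._hidden(x)
        n = len(h)
        Ph = [sum(pij * hj for pij, hj in zip(row, h)) for row in self.P]
        # Scalar denominator: no matrix inversion is needed for batch size 1.
        denom = 1.0 + sum(hi * phi for hi, phi in zip(h, Ph))
        # P is symmetric, so P h h^T P equals the outer product of Ph with itself.
        for i in range(n):
            for j in range(n):
                self.P[i][j] -= Ph[i] * Ph[j] / denom
        err = target - sum(hi * bi for hi, bi in zip(h, self.beta))
        # The updated P times h simplifies to Ph / denom.
        for i in range(n):
            self.beta[i] += Ph[i] / denom * err


model = OSELMSingleSample(n_in=2, n_hidden=8)
state = [0.5, -0.25]
model.update(state, 1.0)  # in Q-learning the target would be r + gamma * max Q(s', a')
print(model.predict(state))
```

In a Q-learning setting, `target` would be the usual bootstrapped value for the taken action, and each transition could be discarded after its update, which is the property that makes this style of learning attractive when memory is scarce.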

## 10 Citations

### A Packet Routing using Lightweight Reinforcement Learning Based on Online Sequential Learning

- Computer Science
- 2022 Tenth International Symposium on Computing and Networking Workshops (CANDARW)
- 2022

This work proposes OS-ELM QN (Q-Network), a lightweight machine learning algorithm with a prioritized experience replay buffer and a multi-agent learning function to improve learning performance, and compares it with a deep reinforcement learning based packet routing method using a network simulator.

### A Survey of Domain-Specific Architectures for Reinforcement Learning

- Computer Science
- IEEE Access
- 2022

FPGA-based implementations are the focus of this work, but GPU-based approaches are considered as well, and possible areas for future work are suggested, based on the preceding discussion of existing architectures.

### Efficient Compressed Ratio Estimation using Online Sequential Learning for Edge Computing

- Computer Science
- ArXiv
- 2022

This study developed an efficient RL method for edge devices, referred to as the actor-critic online sequential extreme learning machine (AC-OSELM), and a system to compress data by estimating an appropriate compression ratio on the edge using AC-OSELM.

### Performance improvement of reinforcement learning algorithms for online 3D bin packing using FPGA

- Computer Science
- AIMLSystems
- 2022

This paper uses FPGA as a hardware accelerator to reduce inference time of DQN as well as its pre-/post-processing steps, which allows the optimised algorithm to cover the entire search space within the given time constraints.

### TD3lite: FPGA Acceleration of Reinforcement Learning with Structural and Representation Optimizations

- Computer Science
- 2022 32nd International Conference on Field-Programmable Logic and Applications (FPL)
- 2022

To address the resource and computational overhead due to inference and training of the multiple neural networks of TD3, this work proposes TD3lite, an integrated approach consisting of a network sharing technique combined with bitwidth-optimized block floating-point arithmetic.

### A Hardware Implementation for Deep Reinforcement Learning Machine

- Computer Science
- 2022 RIVF International Conference on Computing and Communication Technologies (RIVF)
- 2022

This paper proposes a hardware architecture to implement the DQN algorithm that is suitable for real-time applications; its main features are low power consumption and suitability for limited hardware resources.

### FAQ: A Flexible Accelerator for Q-Learning with Configurable Environment

- Computer Science
- 2022 IEEE 33rd International Conference on Application-specific Systems, Architectures and Processors (ASAP)
- 2022

Reinforcement Learning is an area of machine learning that is concerned with optimizing the behavior of an agent in an environment by maximizing cumulative rewards. This can be done with classical…

### E2HRL: An Energy-efficient Hardware Accelerator for Hierarchical Deep Reinforcement Learning

- Computer Science
- ACM Trans. Design Autom. Electr. Syst.
- 2022

The proposed Energy Efficient Hierarchical Reinforcement Learning (E2HRL), which is a scalable hardware architecture for RL applications, utilizes a cross-layer design methodology for achieving better energy efficiency, smaller model size, higher accuracy, and system integration at the software and hardware layers.

### Machine Learning for the Control and Monitoring of Electric Machine Drives: Advances and Trends

- Computer Science
- 2021

This review paper systematically summarizes the existing literature on utilizing machine learning techniques for the control and monitoring of electric machine drives and provides some outlook toward promoting its widespread application in the industry with a focus on deploying ML algorithms onto embedded system-on-chip (SoC) field-programmable gate array (FPGA) devices.

### Binarized P-Network: Deep Reinforcement Learning of Robot Control from Raw Images on FPGA

- Computer Science
- IEEE Robotics and Automation Letters
- 2021

This letter proposes a novel DRL algorithm called Binarized P-Network (BPN), which learns image-input control policies using Binarized Convolutional Neural Networks (BCNNs), and adopts a robust value update scheme called Conservative Value Iteration, which is tolerant of function approximation errors.

## 18 References

### A Neural Network-Based On-Device Learning Anomaly Detector for Edge Devices

- Computer Science
- IEEE Transactions on Computers
- 2020

Experiments show that ONLAD has favorable anomaly detection capability in an environment that simulates concept drift, and that ONLAD Core realizes on-device learning for edge devices at low power consumption, enabling standalone execution where data transfers between edge and server are not required.

### Adam: A Method for Stochastic Optimization

- Computer Science
- ICLR
- 2015

This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.

### An Area-Efficient Implementation of Recurrent Neural Network Core for Unsupervised Anomaly Detection

- Computer Science
- 2020 IEEE Symposium in Low-Power and High-Speed Chips (COOL CHIPS)
- 2020

Echo State Network (ESN), a simple form of Recurrent Neural Network (RNN), is analyzed, and its area-efficient implementation is evaluated in terms of anomaly detection capability and area.

### Spectral Normalization for Generative Adversarial Networks

- Computer Science
- ICLR
- 2018

This paper proposes a novel weight normalization technique called spectral normalization to stabilize the training of the discriminator and confirms that spectrally normalized GANs (SN-GANs) are capable of generating images of better or equal quality relative to the previous training stabilization techniques.

### Regularized online sequential learning algorithm for single-hidden layer feedforward neural networks

- Computer Science
- Pattern Recognit. Lett.
- 2011

### Spectral Norm Regularization for Improving the Generalizability of Deep Learning

- Computer Science
- ArXiv
- 2017

This work proposes a simple and effective regularization method, referred to as spectral norm regularization, which penalizes the high spectral norm of weight matrices in neural networks and exhibits better generalizability than other baseline methods.

### Reinforcement learning for robots using neural networks

- Computer Science
- 1992

This dissertation concludes that it is possible to build artificial agents that can acquire complex control policies effectively by reinforcement learning, enabling its application to complex robot-learning problems.

### Multilayer feedforward networks are universal approximators

- Computer Science, Mathematics
- Neural Networks
- 1989

### Human-level control through deep reinforcement learning

- Computer Science
- Nature
- 2015

This work bridges the divide between high-dimensional sensory inputs and actions, resulting in the first artificial agent that is capable of learning to excel at a diverse array of challenging tasks.