Corpus ID: 85518354

Improved robustness of reinforcement learning policies upon conversion to spiking neuronal network platforms applied to ATARI games

@article{Patel2019ImprovedRO,
  title={Improved robustness of reinforcement learning policies upon conversion to spiking neuronal network platforms applied to ATARI games},
  author={Devdhar Patel and Hananel Hazan and Daniel J. Saunders and Hava T. Siegelmann and Robert Thijs Kozma},
  journal={ArXiv},
  year={2019},
  volume={abs/1903.11012}
}
Deep Reinforcement Learning (RL) demonstrates excellent performance on tasks that can be solved by a trained policy. It plays a dominant role among cutting-edge machine learning approaches that use multi-layer neural networks (NNs). At the same time, deep RL suffers from high sensitivity to noisy, incomplete, and misleading input data. Following biological intuition, we involve Spiking Neural Networks (SNNs) to address some deficiencies of deep RL solutions. Previous studies in image classification…
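The abstract is cut off above, but the title indicates that the core idea is converting policies trained with deep RL into spiking neuronal networks, which are typically run in a rate-coded regime. Below is a minimal illustrative sketch of that general conversion idea (NumPy; all names and parameter values are assumptions for illustration, not the authors' actual pipeline): trained ReLU layers are treated as integrate-and-fire layers, the observation is injected as a constant current for T timesteps, and the action whose output neuron spikes most is selected.

```python
# Hypothetical sketch of rate-coded ANN-to-SNN policy conversion, not the
# authors' exact pipeline: trained ReLU layers become integrate-and-fire
# layers, the observation drives the first layer as a constant current for
# T timesteps, and the action with the most output spikes is chosen.
import numpy as np

def snn_policy_action(weights, obs, T=100, threshold=1.0):
    """weights: list of (W, b) pairs from a trained ReLU policy network."""
    spike_counts = None
    v = [np.zeros(W.shape[0]) for W, _ in weights]  # membrane potentials per layer
    for _ in range(T):
        inp = obs
        for i, (W, b) in enumerate(weights):
            v[i] += W @ inp + b                      # integrate input current
            spikes = (v[i] >= threshold).astype(float)
            v[i][spikes > 0] = 0.0                   # reset neurons that fired
            inp = spikes                             # spikes drive the next layer
        spike_counts = inp if spike_counts is None else spike_counts + inp
    return int(np.argmax(spike_counts))              # most active output neuron = action
```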

Citations

Evolutionary and spike-timing-dependent reinforcement learning train spiking neuronal network motor control

This work used spike-timing-dependent reinforcement learning (STDP-RL) and an evolutionary strategy (EVOL) with SNNs to solve the CartPole reinforcement learning (RL) control problem, and revealed EVOL as a powerful method for training SNNs to perform sensory-motor behaviors.
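STDP-RL is a three-factor rule: spike-timing coincidences build a synaptic eligibility trace, and a reward signal converts that trace into a weight change. Below is a hypothetical NumPy sketch of such a reward-modulated STDP update; the trace constants and the potentiation/depression split are illustrative assumptions, not the cited paper's exact rule.

```python
# Hypothetical sketch of a reward-modulated STDP update (three-factor rule):
# pre/post spike coincidences accumulate into an eligibility trace, and a
# delayed reward converts that trace into a weight change. All constants
# and names are illustrative.
import numpy as np

def stdp_rl_step(w, pre_trace, post_trace, pre_spikes, post_spikes,
                 eligibility, reward, lr=1e-3, tau_e=0.9,
                 a_plus=1.0, a_minus=1.0):
    # Decay, then bump, the synaptic traces.
    pre_trace = 0.95 * pre_trace + pre_spikes
    post_trace = 0.95 * post_trace + post_spikes
    # STDP term: potentiate pre-before-post, depress post-before-pre.
    stdp = a_plus * np.outer(post_spikes, pre_trace) \
         - a_minus * np.outer(post_trace, pre_spikes)
    # Eligibility trace accumulates the STDP term.
    eligibility = tau_e * eligibility + stdp
    # Reward gates the actual weight change.
    w = w + lr * reward * eligibility
    return w, pre_trace, post_trace, eligibility
```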

Minibatch Processing in Spiking Neural Networks

To the authors' knowledge, this is the first general-purpose implementation of minibatch processing in a spiking neural network simulator; it works with arbitrary neuron and synapse models and demonstrates the effectiveness of large batch sizes in two SNN application domains.

Multi-timescale biological learning algorithms train spiking neuronal network motor control

Compared to the STDP-RL and EVOL algorithms operating on their own, the interleaved training paradigm produced enhanced robustness in performance, with different strategies revealed through analysis of the sensory/motor mappings.

Minibatch Processing for Speed-up and Scalability of Spiking Neural Network Simulation

This work provides an implementation of minibatch processing applied to clock-based SNN simulation, leading to drastically increased data throughput; different parameter-reduction techniques are shown to produce different learning outcomes in a simulation of networks trained with spike-timing-dependent plasticity.
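The essence of minibatch SNN simulation is that every piece of neuron state gains a leading batch dimension, so a clock-based update steps B examples at once. A minimal sketch, assuming leaky integrate-and-fire neurons and NumPy (not the simulator's actual API):

```python
# Minimal sketch of minibatch SNN simulation: the only change from a
# single-example simulator is a leading batch dimension on every state
# tensor, so B examples advance through the clock-based update together.
import numpy as np

def lif_step(v, spikes_in, W, decay=0.9, threshold=1.0):
    """v: (B, N) membrane potentials; spikes_in: (B, M) input spikes;
    W: (N, M) weights. Returns updated potentials and output spikes."""
    v = decay * v + spikes_in @ W.T        # leak plus batched input current
    spikes_out = (v >= threshold).astype(v.dtype)
    v = v * (1.0 - spikes_out)             # reset the neurons that fired
    return v, spikes_out
```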

Dependency-Aware Computation Offloading in Mobile Edge Computing: A Reinforcement Learning Approach

A model-free approach based on reinforcement learning (RL) is proposed: a Q-learning agent that adaptively learns to jointly optimize the offloading decision and energy consumption by interacting with the network environment, aiming to minimize the execution time of mobile applications under energy-consumption constraints.
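The tabular Q-learning update that such an approach builds on is standard; a minimal sketch follows, with the offloading-specific state and action encodings left out (the learning-rate and discount values are assumptions).

```python
# Standard tabular Q-learning update; the offloading-specific state/action
# encoding from the cited paper is omitted.
import numpy as np

def q_update(Q, s, a, reward, s_next, alpha=0.1, gamma=0.99):
    """Q: 2-D array indexed by [state, action]."""
    td_target = reward + gamma * np.max(Q[s_next])   # bootstrap from best next action
    Q[s, a] += alpha * (td_target - Q[s, a])          # move toward the TD target
    return Q
```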

Training spiking neuronal networks to perform motor control using reinforcement and evolutionary learning

This work trained SNNs to solve the CartPole reinforcement learning (RL) control problem using two learning mechanisms operating at different timescales, spike-timing-dependent reinforcement learning (STDP-RL) and an evolutionary strategy (EVOL), revealing EVOL as a powerful method for training SNNs to perform sensory-motor behaviors.

References

Showing 1-10 of 49 references

SuperSpike: Supervised Learning in Multilayer Spiking Neural Networks

SuperSpike is derived, a nonlinear voltage-based three-factor learning rule capable of training multilayer networks of deterministic integrate-and-fire neurons to perform nonlinear computations on spatiotemporal spike patterns.
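SuperSpike's key trick is a surrogate gradient: the forward pass keeps the hard spike threshold, while the backward pass substitutes a smooth fast-sigmoid derivative for the undefined spike derivative. A minimal PyTorch-style sketch of that surrogate follows; the steepness value is an assumption, and the paper's full three-factor rule with eligibility traces and feedback signals is omitted.

```python
# Surrogate-gradient spike function in the spirit of SuperSpike: forward is a
# hard threshold, backward replaces the undefined spike derivative with the
# fast-sigmoid surrogate 1 / (1 + |beta * v|)^2. Sketch only.
import torch

class SurrogateSpike(torch.autograd.Function):
    beta = 10.0  # surrogate steepness (illustrative value)

    @staticmethod
    def forward(ctx, v):
        ctx.save_for_backward(v)
        return (v >= 0.0).float()          # spike if membrane crosses threshold

    @staticmethod
    def backward(ctx, grad_output):
        (v,) = ctx.saved_tensors
        surrogate = 1.0 / (1.0 + SurrogateSpike.beta * v.abs()) ** 2
        return grad_output * surrogate
```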

Theory and Tools for the Conversion of Analog to Spiking Convolutional Neural Networks

A novel theory is provided that explains why traditional CNNs can be converted into deep spiking neural networks (SNNs), and several new tools are derived to convert a larger and more powerful class of deep networks into SNNs.

Human-level control through deep reinforcement learning

This work bridges the divide between high-dimensional sensory inputs and actions, resulting in the first artificial agent that is capable of learning to excel at a diverse array of challenging tasks.

Fast-classifying, high-accuracy spiking deep networks through weight and threshold balancing

The method for converting an ANN into an SNN enables low-latency classification, reaching high accuracy already at the first output spike; compared with previous SNN approaches it yields improved performance without increased training time.
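Weight and threshold balancing amounts to rescaling each layer so that converted neurons never need to fire at more than the maximum rate. A hypothetical sketch of the data-based variant of this normalization (rescaling each layer by the maximum activation observed on sample inputs); function and parameter names are illustrative.

```python
# Hypothetical sketch of data-based weight normalization for ANN-to-SNN
# conversion: each layer's weights and biases are rescaled by the maximum
# activation that layer produces on a sample of training data.
import numpy as np

def normalize_weights(weights, sample_inputs):
    """weights: list of (W, b) for ReLU layers; sample_inputs: (K, D) array."""
    normalized, x, prev_scale = [], sample_inputs, 1.0
    for W, b in weights:
        x = np.maximum(x @ W.T + b, 0.0)     # ReLU activations on the sample
        scale = np.max(x)                    # layer's maximum activation
        normalized.append((W * prev_scale / scale, b / scale))
        prev_scale = scale
    return normalized
```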

Deep Learning With Spiking Neurons: Opportunities and Challenges

This review addresses the opportunities that deep spiking networks offer and investigates in detail the challenges associated with training SNNs in a way that makes them competitive with conventional deep learning, but simultaneously allows for efficient mapping to hardware.

Dueling Network Architectures for Deep Reinforcement Learning

This paper presents a new neural network architecture for model-free reinforcement learning that leads to better policy evaluation in the presence of many similar-valued actions and enables the RL agent to outperform the state-of-the-art on the Atari 2600 domain.
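The dueling architecture splits the Q-function into a state value and action advantages, recombined as Q(s, a) = V(s) + A(s, a) - mean_a A(s, a). A minimal PyTorch sketch of that head (layer sizes and names are illustrative):

```python
# Dueling head: separate value and advantage streams over shared features,
# recombined with the mean-advantage baseline to keep V and A identifiable.
import torch
import torch.nn as nn

class DuelingHead(nn.Module):
    def __init__(self, feature_dim, num_actions):
        super().__init__()
        self.value = nn.Linear(feature_dim, 1)                 # V(s)
        self.advantage = nn.Linear(feature_dim, num_actions)   # A(s, a)

    def forward(self, features):
        v = self.value(features)
        a = self.advantage(features)
        return v + a - a.mean(dim=1, keepdim=True)             # Q(s, a)
```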

Measuring and Characterizing Generalization in Deep Reinforcement Learning

The extent to which deep Q-networks learn generalized representations is called into question, and it is suggested that more experimentation and analysis are necessary before claims of representation learning can be supported.

Deep Reinforcement Learning with Double Q-Learning

This paper proposes a specific adaptation to the DQN algorithm and shows that the resulting algorithm not only reduces the observed overestimations, as hypothesized, but also leads to much better performance on several games.
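Double Q-learning decouples action selection from action evaluation: the online network picks the next action and the target network scores it, which curbs the overestimation of the max. A minimal sketch of that target computation (network and variable names are illustrative):

```python
# Double DQN target: the online network chooses the next action, the target
# network evaluates it.
import torch

def double_dqn_target(reward, next_obs, done, online_net, target_net, gamma=0.99):
    with torch.no_grad():
        next_action = online_net(next_obs).argmax(dim=1, keepdim=True)
        next_q = target_net(next_obs).gather(1, next_action).squeeze(1)
        return reward + gamma * (1.0 - done) * next_q
```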

Spatio-Temporal Backpropagation for Training High-Performance Spiking Neural Networks

A spatio-temporal backpropagation (STBP) algorithm for training high-performance SNNs is proposed, which combines the layer-by-layer spatial domain (SD) and the timing-dependent temporal domain (TD) and does not require any additional complicated training techniques.

Gradient Descent for Spiking Neural Networks

A gradient descent method for optimizing spiking network models is presented; by introducing a differentiable formulation of spiking networks and deriving the exact gradient calculation, it offers a general-purpose supervised learning algorithm for spiking neural networks, thus advancing further investigations of spike-based computation.