Sigmoid-Weighted Linear Units for Neural Network Function Approximation in Reinforcement Learning

@article{Elfwing2018SigmoidWeightedLU,
  title={Sigmoid-Weighted Linear Units for Neural Network Function Approximation in Reinforcement Learning},
  author={Stefan Elfwing and Eiji Uchibe and Kenji Doya},
  journal={Neural Networks: the official journal of the International Neural Network Society},
  year={2018},
  volume={107},
  pages={3-11}
}
  • Stefan Elfwing, E. Uchibe, K. Doya
  • Published 10 February 2017
  • Computer Science
  • Neural Networks: the official journal of the International Neural Network Society
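
The paper's central proposal is the sigmoid-weighted linear unit (SiLU), a(z) = z·σ(z), together with its derivative (dSiLU), used as activations for neural-network function approximation in reinforcement learning. A minimal NumPy sketch of the two functions, using the standard definitions (illustrative only, not the authors' code):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def silu(z):
    """Sigmoid-weighted linear unit: z * sigmoid(z)."""
    return z * sigmoid(z)

def dsilu(z):
    """Derivative of SiLU, itself proposed as an activation:
    sigmoid(z) * (1 + z * (1 - sigmoid(z)))."""
    s = sigmoid(z)
    return s * (1.0 + z * (1.0 - s))
```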

Citations of this paper

Piecewise Linear Units Improve Deep Neural Networks

Across a distribution of 30 experiments, it is shown that for the same model architecture, hyperparameters, and pre-processing, PiLU significantly outperforms ReLU: reducing classification error by 18.53% on CIFAR-10 and 13.13% on TSP, for a minor increase in the number of neurons.

Enhanced Reinforcement Learning with Targeted Dropout

The Targeted Dropout strategy is applied to reinforcement learning's DQN, integrating dropout directly into learning, which is necessary for dealing with MDPs with huge or continuous state and action spaces; the proposed enhancement of the DQN is shown to be more accurate in finding the best action to achieve maximum reward.

Improving the Performance of Deep Neural Networks Using Two Proposed Activation Functions

The statistical study of the overall experiments on both classification categories indicates that the proposed activation functions are robust and superior among all the competitive activation functions in terms of average accuracy.

Metatrace Actor-Critic: Online Step-Size Tuning by Meta-gradient Descent for Reinforcement Learning Control

Meta-gradient descent is applied to derive a set of step-size tuning algorithms specifically for online RL control with eligibility traces, and results show that the meta-step-size parameter of Metatrace is easy to set, Metatrace can speed learning, and Metatrace can allow an RL algorithm to deal with non-stationarity in the learning task.

Activation Adaptation in Neural Networks

This work re-formalizes activation functions as cumulative distribution functions (CDFs), which generalizes the class of activation functions considerably and allows the study of i) skewness and ii) smoothness of activation functions.

Metatrace: Online Step-size Tuning by Meta-gradient Descent for Reinforcement Learning Control

Meta-gradient descent is applied to derive a set of step-size tuning algorithms specifically for online RL control with eligibility traces, and results show that the meta-step-size parameter of Metatrace is easy to set, Metatrace can speed learning, and Metatrace can allow an RL algorithm to deal with non-stationarity in the learning task.

Neuroevolution based hierarchical activation function for long short-term model network

A differential evolution algorithm (DEA)-based hierarchical combined activation is proposed as a surrogate for the default activation functions of the LSTM cell, in order to discover an optimal combination of functions for the LSTM network.

NEO: NEuro-Inspired Optimization—A Fractional Time Series Approach

This paper proposes a NEuro-inspired Optimization (NEO) method that leverages the long memory property of fractional time series exhibiting non-exponential power-law decay of trajectories, which contrasts with the short memory characteristics of currently used methods.

Adaptive Rational Activations to Boost Deep Reinforcement Learning

It is demonstrated that equipping popular algorithms with (joint) rational activations leads to consistent improvements on Atari games, notably making DQN competitive with DDQN and Rainbow.

Regularized Flexible Activation Function Combination for Deep Neural Networks

A novel family of flexible activation functions that can replace sigmoid or tanh in LSTM cells is implemented, along with a new family obtained by combining ReLU and ELU, and two new regularisation terms based on prior-knowledge assumptions are introduced.
...

References

SHOWING 1-10 OF 33 REFERENCES

Neural Network Ensembles in Reinforcement Learning

This paper proposes a meta-algorithm to learn state- or state-action values in a neural network ensemble formed by a committee of multiple agents, and shows that the committee benefits from diversity in the value estimates.

On-line Q-learning using connectionist systems

Simulations show that on-line learning algorithms are less sensitive to the choice of training parameters than backward replay, and that the alternative update rules of MCQ-L and Q(λ) are more robust than standard Q-learning updates.
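
For context, a hedged sketch of the two tabular update rules the summary contrasts with one another: MCQ-L (later known as SARSA) bootstraps from the action actually taken, while standard Q-learning bootstraps from the greedy action. The paper itself studies connectionist (neural-network) versions; the tabular form below is illustrative only.

```python
import numpy as np

def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.99):
    """MCQ-L / SARSA: bootstrap from the action actually taken in s_next."""
    td_error = r + gamma * Q[s_next, a_next] - Q[s, a]
    Q[s, a] += alpha * td_error
    return Q

def q_learning_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.99):
    """Standard Q-learning: bootstrap from the greedy action in s_next."""
    td_error = r + gamma * np.max(Q[s_next]) - Q[s, a]
    Q[s, a] += alpha * td_error
    return Q
```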

Dueling Network Architectures for Deep Reinforcement Learning

This paper presents a new neural network architecture for model-free reinforcement learning that leads to better policy evaluation in the presence of many similar-valued actions and enables the RL agent to outperform the state-of-the-art on the Atari 2600 domain.
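
A rough sketch of the dueling aggregation this architecture uses, assuming the commonly cited mean-subtracted form Q(s,a) = V(s) + (A(s,a) − mean over a' of A(s,a')):

```python
import numpy as np

def dueling_aggregate(value, advantages):
    """Combine a state-value estimate and per-action advantages into Q-values.

    value:      shape (batch, 1)          -- V(s)
    advantages: shape (batch, n_actions)  -- A(s, a)
    """
    return value + advantages - advantages.mean(axis=1, keepdims=True)
```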

High-Dimensional Function Approximation for Knowledge-Free Reinforcement Learning: a Case Study in SZ-Tetris

It is shown that a large systematic n-tuple network allows the classical temporal difference learning algorithm to obtain similar average performance to VD-CMA-ES, but at 20 times lower computational expense, leading to the best policy for SZ-Tetris known to date.

Human-level control through deep reinforcement learning

This work bridges the divide between high-dimensional sensory inputs and actions, resulting in the first artificial agent that is capable of learning to excel at a diverse array of challenging tasks.

Issues in Using Function Approximation for Reinforcement Learning

This paper gives a theoretical account of the phenomenon, deriving conditions under which one may expect it to cause learning to fail, and presents experimental results which support the theoretical findings.

Deep Reinforcement Learning with Double Q-Learning

This paper proposes a specific adaptation to the DQN algorithm and shows that the resulting algorithm not only reduces the observed overestimations, as hypothesized, but that this also leads to much better performance on several games.
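
The adaptation referred to is the Double DQN target, in which the online network selects the next action and the target network evaluates it. A minimal sketch under that assumption (q_online and q_target are hypothetical callables returning per-action value arrays):

```python
import numpy as np

def double_dqn_target(r, s_next, done, q_online, q_target, gamma=0.99):
    """Double DQN target: decouple action selection from action evaluation."""
    a_star = np.argmax(q_online(s_next))      # online net picks the action
    bootstrap = q_target(s_next)[a_star]      # target net evaluates it
    return r + gamma * (1.0 - float(done)) * bootstrap
```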

Asynchronous Methods for Deep Reinforcement Learning

A conceptually simple and lightweight framework for deep reinforcement learning that uses asynchronous gradient descent for optimization of deep neural network controllers and shows that asynchronous actor-critic succeeds on a wide variety of continuous motor control problems as well as on a new task of navigating random 3D mazes using a visual input.

TD-Gammon, a Self-Teaching Backgammon Program, Achieves Master-Level Play

The latest version of TD-Gammon is now estimated to play at a strong master level that is extremely close to the world's best human players.

Temporal Differences-Based Policy Iteration and Applications in Neuro-Dynamic Programming

We introduce a new policy iteration method for dynamic programming problems with discounted and undiscounted cost. The method is based on the notion of temporal differences, and is primarily geared