Deep Reinforcement Learning for Accelerating the Convergence Rate
@inproceedings{Fu2016DeepRL, title={Deep Reinforcement Learning for Accelerating the Convergence Rate}, author={Jie Fu and Zichuan Lin and Danlu Chen and Ritchie Ng and Miao Liu and Nicholas L{\'e}onard and Jiashi Feng and Tat-Seng Chua}, year={2016} }
In this paper, we propose a principled deep reinforcement learning (RL) approach that is able to accelerate the convergence rate of general deep neural networks (DNNs). The state features of the agent are learned from the weight statistics of the optimizee during training. The reward function of this agent is designed to learn policies that minimize the optimizee’s training time given a certain performance goal.
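A minimal sketch of the idea described above, not the paper's implementation: the controller observes weight statistics of a toy optimizee, picks a learning-rate adjustment, and receives a reward of -1 per training step until a performance goal is met. The optimizee, the feature set, the action set, the goal threshold, and the random placeholder policy are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "optimizee": logistic regression on random data (stand-in for a DNN).
X = rng.normal(size=(256, 10))
y = (X @ rng.normal(size=10) > 0).astype(float)
W = rng.normal(size=10) * 0.1

def loss_and_grad(W):
    p = 1.0 / (1.0 + np.exp(-X @ W))
    loss = -np.mean(y * np.log(p + 1e-9) + (1 - y) * np.log(1 - p + 1e-9))
    grad = X.T @ (p - y) / len(y)
    return loss, grad

def weight_state(W):
    # State features from weight statistics of the optimizee (illustrative choice).
    return np.array([W.mean(), W.std(), np.abs(W).max(), np.linalg.norm(W)])

actions = [0.5, 1.0, 2.0]            # multiplicative learning-rate adjustments (assumed)
def policy(state):
    # Placeholder for a trained deep RL policy: here a random choice.
    return rng.integers(len(actions))

lr, goal, total_reward = 0.1, 0.3, 0.0
for step in range(200):
    state = weight_state(W)
    lr *= actions[policy(state)]
    loss, grad = loss_and_grad(W)
    W -= lr * grad
    total_reward -= 1.0              # -1 per step: shorter training means higher return
    if loss < goal:                  # performance goal reached, episode ends
        break
print(step, loss, total_reward)
```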
2 Citations
Precise Evaluation for Continuous Action Control in Reinforcement Learning
- Computer Science, HPCCT/BDAI
- 2019
An accurate evaluation mechanism and a corresponding objective function are proposed to accelerate the reinforcement learning training process; experimental results show that accurate evaluation with a log-cosh objective function lets a robot arm learn the grasping task more quickly, converge, and complete the training task.
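For reference, the log-cosh objective mentioned above is the logarithm of the hyperbolic cosine of the error; a minimal sketch (the numerically stable form and the mean reduction are my own choices, not necessarily the paper's):

```python
import numpy as np

def log_cosh(pred, target):
    # log(cosh(e)) behaves like e^2/2 near zero and like |e| for large errors,
    # giving a smooth, outlier-robust objective.
    e = pred - target
    # Stable form: log(cosh(e)) = |e| + log(1 + exp(-2|e|)) - log(2)
    return np.mean(np.abs(e) + np.log1p(np.exp(-2.0 * np.abs(e))) - np.log(2.0))

print(log_cosh(np.array([0.1, 2.0]), np.array([0.0, 0.0])))
```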
AI and Wargaming
- Economics, Computer Science, ArXiv
- 2020
This review examines which features of wargames distinguish them from the usual AI testbeds, and which recent AI advances are best suited to address these wargame-specific features.
References
Showing 1-10 of 24 references
Using Deep Q-Learning to Control Optimization Hyperparameters
- Computer Science, ArXiv
- 2016
A novel definition of the reinforcement learning state, actions, and reward function that allows a deep Q-network to learn to control an optimization hyperparameter is presented; it is shown that the DQN's Q-values associated with the optimal action converge and that the learned Q-gradient descent algorithms outperform gradient descent with an Armijo or nonmonotone line search.
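A toy illustration of the controller idea only, with tabular Q-learning standing in for the deep Q-network and a made-up state discretization, action set, reward, and objective:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy objective to be minimized by gradient descent.
def loss_and_grad(w):
    return 0.5 * np.sum(w ** 2), w

actions = [0.5, 1.0, 2.0]               # scale the learning rate (illustrative)
n_states = 2                            # state: did the last step decrease the loss?
Q = np.zeros((n_states, len(actions)))
alpha, gamma, eps = 0.1, 0.9, 0.2

for episode in range(50):
    w, lr = rng.normal(size=5), 0.05
    state = 1
    for step in range(100):
        a = rng.integers(len(actions)) if rng.random() < eps else int(Q[state].argmax())
        lr = float(np.clip(lr * actions[a], 1e-4, 1.0))
        loss, grad = loss_and_grad(w)
        w = w - lr * grad
        new_loss, _ = loss_and_grad(w)
        next_state = int(new_loss < loss)
        reward = loss - new_loss         # reward the controller for reducing the loss
        Q[state, a] += alpha * (reward + gamma * Q[next_state].max() - Q[state, a])
        state = next_state
print(Q)
```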
Recurrent Reinforcement Learning: A Hybrid Approach
- Computer Science, ArXiv
- 2015
This work investigates a deep-learning approach to learning the representation of states in partially observable tasks, with minimal prior knowledge of the domain, and proposes a new family of hybrid models that combines the strengths of both supervised learning and reinforcement learning, trained in a joint fashion.
Actor-Mimic: Deep Multitask and Transfer Reinforcement Learning
- Computer Science, ICLR
- 2016
This work defines a novel method of multitask and transfer learning that enables an autonomous agent to learn how to behave in multiple tasks simultaneously and then generalize its knowledge to new domains; Atari games are used as a testing environment to demonstrate these methods.
Speeding Up Automatic Hyperparameter Optimization of Deep Neural Networks by Extrapolation of Learning Curves
- Computer Science, IJCAI
- 2015
This paper mimics the early termination of bad runs using a probabilistic model that extrapolates the performance from the first part of a learning curve, enabling state-of-the-art hyperparameter optimization methods for DNNs to find DNN settings that yield better performance than those chosen by human experts.
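A rough sketch of the early-termination idea only: the cited work fits an ensemble of parametric curve models with Bayesian inference, whereas here a single power-law fit via `scipy.optimize.curve_fit` stands in, with an assumed decision rule and thresholds.

```python
import numpy as np
from scipy.optimize import curve_fit

def power_law(t, a, b, c):
    # Accuracy curve that saturates at a: y(t) = a - b * t^(-c)
    return a - b * np.power(t, -c)

def should_terminate(partial_curve, best_final_so_far, horizon):
    t = np.arange(1, len(partial_curve) + 1, dtype=float)
    try:
        params, _ = curve_fit(power_law, t, partial_curve,
                              p0=[partial_curve[-1], 0.5, 0.5], maxfev=5000)
    except RuntimeError:
        return False                      # fit failed: keep training
    predicted_final = power_law(float(horizon), *params)
    return predicted_final < best_final_so_far

# Example: a run whose validation accuracy is flattening out well below the best seen.
curve = 0.6 - 0.3 * np.arange(1, 21, dtype=float) ** -0.7
print(should_terminate(curve, best_final_so_far=0.75, horizon=200))
```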
Online Batch Selection for Faster Training of Neural Networks
- Computer Science, ArXiv
- 2015
This work investigates online batch selection strategies for two state-of-the-art methods of stochastic gradient-based optimization, AdaDelta and Adam, and proposes a simple strategy where all datapoints are ranked w.r.t. their latest known loss value and the probability of being selected decays exponentially as a function of rank.
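A minimal sketch of the ranked, exponentially decaying selection probability described above; the decay parameterization and batch mechanics are illustrative assumptions rather than the paper's exact scheme.

```python
import numpy as np

rng = np.random.default_rng(0)

def selection_probs(latest_losses, s_e=100.0):
    # Rank datapoints by latest known loss (highest loss first); the selection
    # probability decays exponentially with rank, so hard examples are revisited
    # more often. s_e controls the ratio between the first and last ranks (assumed).
    n = len(latest_losses)
    order = np.argsort(-latest_losses)          # indices from highest to lowest loss
    decay = np.exp(np.log(s_e) / n)
    p = 1.0 / decay ** np.arange(n)
    probs = np.empty(n)
    probs[order] = p / p.sum()
    return probs

losses = rng.uniform(size=1000)
probs = selection_probs(losses)
batch = rng.choice(len(losses), size=64, replace=False, p=probs)
```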
Weight Features for Predicting Future Model Performance of Deep Neural Networks
- Computer Science, IJCAI
- 2016
The findings demonstrate that using weight features can help construct prediction models with a smaller number of training samples and terminate underperforming runs at an earlier stage of the learning process of DNNs than the conventional use of learning curves, thus facilitating the speed-up of hyperparameter searches.
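A small sketch of what "weight features" might look like in practice: per-layer statistics of the weight tensors, collected during training and fed to a model that predicts final performance. The specific statistics below are an illustrative choice, not the paper's exact feature set.

```python
import numpy as np

def weight_features(layer_weights):
    """Summary statistics per weight tensor, concatenated into one feature vector."""
    feats = []
    for W in layer_weights:
        w = np.asarray(W).ravel()
        feats.extend([w.mean(), w.std(), np.abs(w).mean(),
                      np.linalg.norm(w), np.percentile(w, 10), np.percentile(w, 90)])
    return np.array(feats)

# Example: two layers' worth of (random) weights at some training step.
rng = np.random.default_rng(0)
snapshot = [rng.normal(size=(784, 128)), rng.normal(size=(128, 10))]
x = weight_features(snapshot)   # input to a regressor predicting final accuracy
print(x.shape)                  # (12,)
```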
Adam: A Method for Stochastic Optimization
- Computer Science, ICLR
- 2015
This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.
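For reference, the Adam update written out as a short sketch with the standard default hyperparameters; the smoke test at the end is illustrative.

```python
import numpy as np

def adam_step(w, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update: exponential moment estimates with bias correction."""
    m = beta1 * m + (1 - beta1) * grad          # first moment (mean of gradients)
    v = beta2 * v + (1 - beta2) * grad ** 2     # second moment (uncentered variance)
    m_hat = m / (1 - beta1 ** t)                # bias correction
    v_hat = v / (1 - beta2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v

# Minimize 0.5*||w||^2 (gradient is w) as a smoke test.
w = np.ones(3); m = np.zeros(3); v = np.zeros(3)
for t in range(1, 1001):
    w, m, v = adam_step(w, w, m, v, t)
print(w)
```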
No more pesky learning rates
- Computer Science, ICML
- 2013
The proposed method to automatically adjust multiple learning rates so as to minimize the expected error at any one time relies on local gradient variations across samples, making it suitable for non-stationary problems.
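A rough sketch of the adaptive-rate idea: per-parameter step sizes scaled by the ratio of the squared running-mean gradient to the running mean of squared gradients, which shrinks the rate when gradients vary a lot across samples. This is a simplified reading; the paper additionally uses a curvature estimate and adaptive memory sizes, which are omitted here.

```python
import numpy as np

def adaptive_rate_step(w, grad, g_bar, v_bar, base_lr=0.1, tau=20.0):
    # Running averages of the gradient and its square (memory of roughly tau samples).
    g_bar = (1 - 1 / tau) * g_bar + (1 / tau) * grad
    v_bar = (1 - 1 / tau) * v_bar + (1 / tau) * grad ** 2
    # Per-parameter rate shrinks with gradient noise: (E[g])^2 / E[g^2] lies in [0, 1].
    snr = g_bar ** 2 / (v_bar + 1e-12)
    w = w - base_lr * snr * grad
    return w, g_bar, v_bar

# Noisy gradients of 0.5*||w||^2 as a toy test.
rng = np.random.default_rng(0)
w = np.ones(3); g_bar = np.zeros(3); v_bar = np.full(3, 1e-6)
for _ in range(500):
    grad = w + 0.5 * rng.normal(size=3)
    w, g_bar, v_bar = adaptive_rate_step(w, grad, g_bar, v_bar)
print(w)
```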
Learning to learn by gradient descent by gradient descent
- Computer Science, NIPS
- 2016
This paper shows how the design of an optimization algorithm can be cast as a learning problem, allowing the algorithm to learn to exploit structure in the problems of interest in an automatic way.
Scalable Bayesian Optimization Using Deep Neural Networks
- Computer Science, ICML
- 2015
This work shows that performing adaptive basis function regression with a neural network as the parametric form performs competitively with state-of-the-art GP-based approaches, but scales linearly with the number of data points rather than cubically, which allows for a previously intractable degree of parallelism.
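A compact sketch of the "neural network as adaptive basis functions" idea: take the network's last hidden layer as basis features and place a Bayesian linear regression on top, whose posterior is linear in the number of observations rather than cubic as in a full GP. Here a fixed random tanh feature map stands in for a trained network's hidden layer, and the prior/noise hyperparameters are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a trained network's last hidden layer: a fixed random tanh feature map.
W1, b1 = rng.normal(size=(1, 50)), rng.normal(size=50)
def basis(x):
    return np.tanh(x @ W1 + b1)

def bayes_linreg_posterior(Phi, y, alpha=1.0, beta=25.0):
    # Posterior over last-layer weights: cost is O(n * d^2), linear in n observations.
    A = alpha * np.eye(Phi.shape[1]) + beta * Phi.T @ Phi
    A_inv = np.linalg.inv(A)
    mean = beta * A_inv @ Phi.T @ y
    return mean, A_inv, beta

def predict(x_new, mean, A_inv, beta):
    phi = basis(x_new)
    mu = phi @ mean
    var = 1.0 / beta + np.sum(phi @ A_inv * phi, axis=1)
    return mu, var                      # predictive mean/variance for an acquisition function

# Toy 1-D objective observations.
X = rng.uniform(-3, 3, size=(40, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=40)
mean, A_inv, beta = bayes_linreg_posterior(basis(X), y)
mu, var = predict(np.linspace(-3, 3, 5).reshape(-1, 1), mean, A_inv, beta)
print(mu, var)
```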