Human-level control through deep reinforcement learning
- Volodymyr Mnih, K. Kavukcuoglu, +16 authors D. Hassabis
- Computer Science, Medicine · Nature
- 26 February 2015
This work bridges the divide between high-dimensional sensory inputs and actions, resulting in the first artificial agent that is capable of learning to excel at a diverse array of challenging tasks.
Playing Atari with Deep Reinforcement Learning
- Volodymyr Mnih, K. Kavukcuoglu, +4 authors Martin A. Riedmiller
- Computer Science · ArXiv
- 19 December 2013
This work presents the first deep learning model to successfully learn control policies directly from high-dimensional sensory input using reinforcement learning, which outperforms all previous approaches on six of the games and surpasses a human expert on three of them.
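The core mechanism described here, Q-learning driven by minibatches drawn from an experience replay buffer, can be sketched in a few lines. In this illustrative toy (not the authors' code), a tabular Q dictionary stands in for the deep network, and a hypothetical 3-state chain environment stands in for the Atari emulator:

```python
import random

# Toy sketch of the replay mechanism behind deep Q-learning. A tabular Q
# dict replaces the deep network; the 3-state chain (move right to reach
# a terminal reward) is a made-up stand-in for the game environment.

random.seed(0)
N_STATES, ACTIONS = 3, [0, 1]          # action 0 = left, 1 = right
GAMMA, ALPHA, EPSILON = 0.9, 0.5, 0.2

def env_step(s, a):
    s2 = min(s + 1, N_STATES - 1) if a == 1 else max(s - 1, 0)
    r = 1.0 if s2 == N_STATES - 1 else 0.0
    return s2, r, s2 == N_STATES - 1   # next state, reward, done

Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
replay = []

for _ in range(300):                   # episodes
    s, done = 0, False
    while not done:
        if random.random() < EPSILON:  # epsilon-greedy exploration
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda b: Q[(s, b)])
        s2, r, done = env_step(s, a)
        replay.append((s, a, r, s2, done))
        # Update on a random minibatch of stored transitions, which
        # decorrelates consecutive updates -- the key idea of replay.
        for ss, aa, rr, ss2, dd in random.sample(replay, min(8, len(replay))):
            target = rr if dd else rr + GAMMA * max(Q[(ss2, b)] for b in ACTIONS)
            Q[(ss, aa)] += ALPHA * (target - Q[(ss, aa)])
        s = s2

greedy = [max(ACTIONS, key=lambda b: Q[(s, b)]) for s in range(N_STATES - 1)]
```

After training, the greedy policy moves right in every non-terminal state. The paper's contribution is doing this at scale, with a convolutional network reading raw pixels in place of the lookup table.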
Striving for Simplicity: The All Convolutional Net
- Jost Tobias Springenberg, A. Dosovitskiy, T. Brox, Martin A. Riedmiller
- Computer Science, Mathematics · ICLR
- 21 December 2014
It is found that max-pooling can simply be replaced by a convolutional layer with increased stride without loss in accuracy on several image recognition benchmarks.
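The observation can be illustrated with a minimal 1-D toy (the kernel and signal below are made up): a convolution with stride 2 halves the resolution exactly like a pooling layer with stride 2, while keeping learnable weights.

```python
# 1-D illustration: a stride-2 convolution downsamples like stride-2
# pooling, but with trainable parameters instead of a hard-coded max.

def conv1d(x, kernel, stride):
    k = len(kernel)
    return [sum(kernel[j] * x[i + j] for j in range(k))
            for i in range(0, len(x) - k + 1, stride)]

def maxpool1d(x, size, stride):
    return [max(x[i:i + size])
            for i in range(0, len(x) - size + 1, stride)]

x = [1.0, 3.0, 2.0, 5.0, 4.0, 0.0, 1.0, 2.0]

pooled = maxpool1d(x, size=2, stride=2)            # no parameters
strided = conv1d(x, kernel=[0.5, 0.5], stride=2)   # learnable weights

# Both halve the resolution of the input signal.
assert len(pooled) == len(strided) == len(x) // 2
```

In 2-D the same argument applies per spatial dimension, which is how the paper arrives at networks built from convolutions alone.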
Deterministic Policy Gradient Algorithms
- D. Silver, Guy Lever, N. Heess, T. Degris, Daan Wierstra, Martin A. Riedmiller
- Mathematics, Computer Science · ICML
- 21 June 2014
This paper introduces an off-policy actor-critic algorithm that learns a deterministic target policy from an exploratory behaviour policy and demonstrates that deterministic policy gradient algorithms can significantly outperform their stochastic counterparts in high-dimensional action spaces.
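The deterministic policy gradient itself is a chain rule: the actor parameters follow the critic's action gradient evaluated at the actor's own action. A minimal sketch, with a linear actor mu(s) = theta * s and a hand-specified quadratic critic Q(s, a) = -(a - 2s)^2 whose optimum is theta = 2 (both functions are illustrative stand-ins; the paper learns the critic as well):

```python
# Deterministic policy gradient update on a toy problem:
# grad_theta J = dmu/dtheta * dQ/da, evaluated at a = mu(s).

def dq_da(s, a):
    return -2.0 * (a - 2.0 * s)     # gradient of Q(s, a) = -(a - 2s)^2

def dmu_dtheta(s):
    return s                        # gradient of mu(s) = theta * s

theta, lr = 0.0, 0.1
states = [0.5, 1.0, 1.5, 2.0]       # made-up batch of states

for _ in range(200):
    for s in states:
        a = theta * s               # deterministic action
        theta += lr * dmu_dtheta(s) * dq_da(s, a)
```

The update needs no integral over actions, which is why the deterministic form scales better to high-dimensional action spaces than its stochastic counterpart.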
A direct adaptive method for faster backpropagation learning: the RPROP algorithm
- Martin A. Riedmiller, H. Braun
- Computer Science · IEEE International Conference on Neural Networks
- 28 March 1993
A learning algorithm for multilayer feedforward networks, RPROP (resilient propagation), is proposed that performs a local adaptation of the weight-updates according to the behavior of the error function to overcome the inherent disadvantages of pure gradient-descent.
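The local adaptation is sign-based: each weight keeps its own step size, which grows while the gradient sign is stable and shrinks when it flips. A compact pure-Python sketch (an illustrative reimplementation, not the authors' code, using the commonly cited defaults eta+ = 1.2, eta- = 0.5):

```python
# Minimal RPROP sketch: per-weight step sizes adapt to the *sign* of the
# gradient, ignoring its magnitude -- the cure for pure gradient descent's
# sensitivity to gradient scale.

def rprop(grad_fn, w, steps=100, delta0=0.1, eta_plus=1.2, eta_minus=0.5,
          delta_max=50.0, delta_min=1e-6):
    n = len(w)
    delta = [delta0] * n            # per-weight step size
    prev_g = [0.0] * n
    for _ in range(steps):
        g = grad_fn(w)
        for i in range(n):
            s = g[i] * prev_g[i]
            if s > 0:               # same sign: accelerate
                delta[i] = min(delta[i] * eta_plus, delta_max)
            elif s < 0:             # sign flip: overshot, back off
                delta[i] = max(delta[i] * eta_minus, delta_min)
                g[i] = 0.0          # skip the update after a sign change
            if g[i] > 0:
                w[i] -= delta[i]
            elif g[i] < 0:
                w[i] += delta[i]
            prev_g[i] = g[i]
    return w

# Toy objective f(w) = (w0 - 3)^2 + (w1 + 1)^2 with analytic gradient.
grad = lambda w: [2 * (w[0] - 3), 2 * (w[1] + 1)]
w = rprop(grad, [0.0, 0.0])
```

Because only the gradient's sign is used, the update is insensitive to badly scaled error surfaces, which is the inherent disadvantage of plain gradient descent that the paper targets.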
Neural Fitted Q Iteration - First Experiences with a Data Efficient Neural Reinforcement Learning Method
- Martin A. Riedmiller
- Computer Science · ECML
- 3 October 2005
NFQ, an algorithm for efficient and effective training of a Q-value function represented by a multi-layer perceptron, is introduced, and it is shown empirically that reasonably few interactions with the plant are needed to generate control policies of high quality.
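The data efficiency comes from the fitted Q iteration loop: gather a fixed batch of transitions once, then alternate between computing TD targets from the current Q and refitting Q to those targets in batch mode. In the sketch below a lookup table fitted by exact averaging stands in for the multi-layer perceptron, and the 4-state chain environment is a made-up example:

```python
import random

# Fitted Q iteration sketch: (1) compute targets from the current Q,
# (2) refit Q to the targets on the whole stored batch, repeat.

random.seed(1)
N, ACTIONS, GAMMA = 4, [0, 1], 0.9

def env_step(s, a):
    s2 = min(s + 1, N - 1) if a == 1 else max(s - 1, 0)
    return s2, (1.0 if s2 == N - 1 else 0.0)

# 1. Gather experience once, with a purely random policy.
batch = []
for _ in range(500):
    s = random.randrange(N - 1)
    a = random.choice(ACTIONS)
    s2, r = env_step(s, a)
    batch.append((s, a, r, s2))

# 2. Alternate target computation and batch refitting.
Q = {(s, a): 0.0 for s in range(N) for a in ACTIONS}
for _ in range(50):
    targets = {}
    for (s, a, r, s2) in batch:
        t = r + (0.0 if s2 == N - 1 else
                 GAMMA * max(Q[(s2, b)] for b in ACTIONS))
        targets.setdefault((s, a), []).append(t)
    # "Fit": for a table, the least-squares fit is the mean target.
    for key, ts in targets.items():
        Q[key] = sum(ts) / len(ts)

policy = [max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(N - 1)]
```

Because every fit reuses the whole stored batch, no further interaction with the plant is required once the transitions are collected.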
Embed to Control: A Locally Linear Latent Dynamics Model for Control from Raw Images
- Manuel Watter, Jost Tobias Springenberg, J. Boedecker, Martin A. Riedmiller
- Computer Science, Mathematics · NIPS
- 24 June 2015
Embed to Control is introduced, a method for model learning and control of non-linear dynamical systems from raw pixel images that is derived directly from an optimal control formulation in latent space and exhibits strong performance on a variety of complex control problems.
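The model class is locally linear in the learned latent space: transitions take the form z' = A z + B u + o, where A, B, and o are predicted by a network in the paper. A minimal pure-Python step with a hypothetical 2-D latent state and 1-D control, just to show the form of the model:

```python
# One step of locally linear latent dynamics: z' = A z + B u + o.
# A, B, o, z, and u below are illustrative numbers, not learned values.

def latent_step(A, B, o, z, u):
    return [sum(A[i][j] * z[j] for j in range(len(z)))
            + B[i] * u + o[i]
            for i in range(len(z))]

A = [[1.0, 0.1], [0.0, 1.0]]   # e.g. a position-velocity integrator
B = [0.0, 0.1]
o = [0.0, 0.0]
z, u = [1.0, 0.0], 2.0

z_next = latent_step(A, B, o, z, u)
```

Local linearity is what makes the latent space amenable to standard optimal control machinery such as iterative LQR, which is the point of the formulation.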
Discriminative Unsupervised Feature Learning with Convolutional Neural Networks
- A. Dosovitskiy, Jost Tobias Springenberg, Martin A. Riedmiller, T. Brox
- Computer Science · NIPS
- 26 June 2014
This paper presents an approach for training a convolutional neural network using only unlabeled data and trains the network to discriminate between a set of surrogate classes, finding that this simple feature learning algorithm is surprisingly successful when applied to visual object recognition.
Multimodal deep learning for robust RGB-D object recognition
- Andreas Eitel, Jost Tobias Springenberg, Luciano Spinello, Martin A. Riedmiller, W. Burgard
- Computer Science · IEEE/RSJ International Conference on Intelligent…
- 24 July 2015
This paper leverages recent progress on Convolutional Neural Networks (CNNs) and proposes a novel RGB-D architecture for object recognition, composed of two separate CNN processing streams (one for each modality) that are subsequently combined in a late fusion network.
Emergence of Locomotion Behaviours in Rich Environments
This paper explores how a rich environment can help to promote the learning of complex behavior, and finds that this encourages the emergence of robust behaviours that perform well across a suite of tasks.