This paper investigates a novel model-free reinforcement learning architecture, the Natural Actor-Critic. The actor updates are based on stochastic policy gradients employing Amari's natural gradient approach, while the critic obtains both the natural policy gradient and additional parameters of a value function simultaneously by linear regression. We show ...
Many motor skills in humanoid robotics can be learned using parametrized motor primitives as done in imitation learning. However, most interesting motor learning problems are high-dimensional reinforcement learning problems often beyond the reach of current methods. In this paper, we extend previous work on policy learning from the immediate reward case to ...
The acquisition and improvement of motor skills and control policies for robotics from trial and error is of essential importance if robots should ever leave precisely pre-structured environments. However, to date only few existing reinforcement learning methods have been scaled into the domains of high-dimensional robots such as manipulator, legged or ...
Anaphase initiation has been postulated to be controlled through the ubiquitin-dependent proteolysis of an unknown inhibitor. This process involves the anaphase promoting complex (APC), a specific ubiquitin ligase that has been shown to be involved in mitotic cyclin degradation. Previous studies demonstrated that in Saccharomyces cerevisiae, Pds1 protein is ...
The ordered activation of the ubiquitin protein ligase anaphase-promoting complex (APC) or cyclosome by CDC20 in metaphase and by CDH1 in telophase is essential for anaphase and for exit from mitosis, respectively. Here, we show that CDC20 can only bind to and activate the mitotically phosphorylated form of the Xenopus and the human APC in vitro. In ...
Reinforcement learning offers to robotics a framework and set of tools for the design of sophisticated and hard-to-engineer behaviors. Conversely, the challenges of robotic problems provide both inspiration, impact, and validation for developments in reinforcement learning. The relationship between disciplines has sufficient promise to be likened to that ...
We present a model-free reinforcement learning method for partially observable Markov decision problems. Our method estimates a likelihood gradient by sampling directly in parameter space, which leads to lower variance gradient estimates than obtained by regular policy gradient methods. We show that for several complex control tasks, including robust ...
In mammalian somatic-cell cycles, progression through the G1-phase restriction point and initiation of DNA replication are controlled by the ability of the retinoblastoma tumour-suppressor protein (pRb) family to regulate the E2F/DP transcription factors. Continuing transcription of E2F target genes beyond the G1/S transition is required for coordinating ...