• Publications
  • Influence
Natural Evolution Strategies
TLDR
NES is presented, a novel algorithm for performing real-valued dasiablack boxpsila function optimization: optimizing an unknown objective function where algorithm-selected function measurements constitute the only information accessible to the method.
Framewise phoneme classification with bidirectional LSTM networks
TLDR
It is found that bidirectional LSTM outperforms both RNNs and unidirectionalLSTM, and the significance of framewise phoneme classification to continuous speech recognition and the validity of usingbidirectional networks for online causal tasks is discussed.
A First Look at Music Composition using LSTM Recurrent Neural Networks
TLDR
Long Short-Term Memory is shown to be able to play the blues with good timing and proper structure as long as one is willing to listen, and once the network has found the relevant structure it does not drift from it.
On learning how to learn learning strategies
TLDR
This paper introduces the "incremental self-improvement paradigm", a reinforcement learning system able to "shift its inductive bias" in a universal way, and a particular implementation based on the novel paradigm is presented.
Reinforcement Learning Upside Down: Don't Predict Rewards - Just Map Them to Actions
TLDR
This work transforms reinforcement learning into a form of supervised learning (SL) by turning traditional RL on its head, calling this Upside Down RL (UDRL), and conceptually simplify an approach for teaching a robot to imitate humans.
Simple Principles of Metalearning
TLDR
The metalearning principle allows for embedding the learner''s policy modification strategy within the policy itself (self-reference) and is tested in complex, non-Markovian, changing environments (``POMDPs'').
Hindsight policy gradients
TLDR
This paper shows how hindsight can be introduced to likelihood-ratio policy gradient methods, generalizing this capacity to an entire class of highly successful algorithms.
Long Short-Term Memory Learns Context Free and Context Sensitive Languages
TLDR
LSTM variants are also the first RNNs to learn a context sensitive language (\mbox{CSL}), namely $a^nb^n c^n$.
A linear time natural evolution strategy for non-separable functions
TLDR
A novel Natural Evolution Strategy (NES) variant, the Rank-One NES (R1-NES), which uses a low-rank approximation of the search distribution covariance matrix, and excels in solving high-dimensional non-separable problems.
...
...