First Experiments with PowerPlay

@article{Srivastava2013FirstEW,
  title={First Experiments with PowerPlay},
  author={Rupesh Kumar Srivastava and Bas R. Steunebrink and J{\"u}rgen Schmidhuber},
  journal={Neural Networks},
  year={2013},
  volume={41},
  pages={130--136}
}
Like a scientist or a playing child, POWERPLAY (Schmidhuber, 2011) not only learns new skills to solve given problems, but also invents new interesting problems by itself. By design, it continually comes up with the fastest to find, initially novel, but eventually solvable tasks. It also continually simplifies or compresses or speeds up solutions to previous tasks. Here we describe first experiments with POWERPLAY. A self-delimiting recurrent neural network SLIM RNN (Schmidhuber, 2012) is used …
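The loop sketched in the abstract — invent the simplest still-unsolvable task, modify the solver so it masters the new task without forgetting old ones, repeat — can be illustrated with a toy Python sketch. This is not the authors' implementation; the task domain (tasks as integers, solver as a set of solved integers) and the names `toy_search` and `powerplay` are hypothetical stand-ins for real task/solver search.

```python
# Toy, illustrative sketch of the PowerPlay loop (Schmidhuber, 2011):
# repeatedly find the "simplest still unsolvable" task, together with a
# modified solver that solves it AND every previously invented task.

def toy_search(solver, repertoire):
    """Return (task, new_solver): the cheapest-to-find unsolved task
    plus a solver modification that preserves all old skills."""
    task = 0
    while task in solver:              # scan tasks in order of "simplicity"
        task += 1
    new_solver = solver | {task}       # modification adding the new skill
    if all(t in new_solver for t in repertoire):   # no forgetting allowed
        return task, new_solver
    return None

def powerplay(solver, repertoire, search, iters):
    for _ in range(iters):
        found = search(solver, repertoire)
        if found is None:              # no acceptable task/solver pair
            break
        task, solver = found
        repertoire.append(task)        # repertoire only ever grows
    return solver, repertoire

solver, repertoire = powerplay(set(), [], toy_search, 5)
print(repertoire)   # tasks invented in order of increasing difficulty
```

In a real instantiation the search would enumerate task/solver-modification pairs in order of search cost, accepting the first pair that passes the no-forgetting check over the whole repertoire.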
Citations

PowerPlay: Training an Increasingly General Problem Solver by Continually Searching for the Simplest Still Unsolvable Problem
This work focuses on automatically inventing or discovering problems in a way inspired by the playful behavior of animals and humans, to train a more and more general problem solver from scratch in an unsupervised fashion.

One Big Net For Everything
The incremental training of an increasingly general problem solver, continually learning to solve new tasks without forgetting previous skills, is applied to greatly speed up subsequent learning of additional, novel but algorithmically related skills.

Competitive Experience Replay
This work proposes a novel method called competitive experience replay, which efficiently supplements a sparse reward by placing learning in the context of an exploration competition between a pair of agents, creating a competitive game designed to drive exploration.

Meta-learning curiosity algorithms
This work proposes a strategy for encoding curiosity algorithms as programs in a domain-specific language and searching, during a meta-learning phase, for algorithms that enable RL agents to perform well in new domains.

Hindsight Experience Replay
A novel technique is presented which allows sample-efficient learning from rewards which are sparse and binary, and therefore avoids the need for complicated reward engineering; it may be seen as a form of implicit curriculum.

Effective, interpretable algorithms for curiosity automatically discovered by evolutionary search
We take the hypothesis that curiosity is a mechanism found by evolution that encourages meaningful exploration early in an agent's life, in order to expose it to experiences that enable it to obtain …

Intrinsically Motivated Goal Exploration Processes with Automatic Curriculum Learning
The computational efficiency of IMGEPs is illustrated: the robotic experiments use simple memory-based low-level policy representations and search algorithms, enabling the whole system to learn online and incrementally on a Raspberry Pi 3.

A Curious Robot Learner for Interactive Goal-Babbling: Strategically Choosing What, How, When and from Whom to Learn
An intrinsically motivated active learner is built which learns how its actions can produce varied consequences or outcomes; it actively learns online by sampling data which it chooses using several sampling modes.

Optimal Curiosity-Driven Modular Incremental Slow Feature Analysis
This work theoretically shows that, using a model called curiosity-driven modular incremental slow feature analysis, the agent on average will learn slow feature representations in order of increasing learning difficulty, under certain mild conditions.

References

Showing 1–10 of 35 references.
Continually adding self-invented problems to the repertoire: First experiments with POWERPLAY
A self-delimiting recurrent neural network (SLIM RNN) is used as a general computational architecture to implement the system's solver, which becomes an increasingly general problem solver, continually adding new problem-solving procedures to its growing repertoire and exhibiting interesting developmental stages.

PowerPlay: Training an Increasingly General Problem Solver by Continually Searching for the Simplest Still Unsolvable Problem
This work focuses on automatically inventing or discovering problems in a way inspired by the playful behavior of animals and humans, to train a more and more general problem solver from scratch in an unsupervised fashion.

Developmental robotics, optimal artificial curiosity, creativity, music, and the fine arts
It is pointed out how the fine arts can be formally understood as a consequence of the basic principle: given some subjective observer, great works of art and music yield observation histories exhibiting more novel, previously unknown compressibility/regularity/predictability than lesser works, thus deepening the observer's understanding of the world and what is possible in it.

Exploring the predictable
This work studies an embedded active learner that can limit its predictions to almost arbitrary computable aspects of spatio-temporal events; it constructs probabilistic algorithms that map event sequences to abstract internal representations (IRs) and predicts IRs from IRs computed earlier.

Formal Theory of Creativity, Fun, and Intrinsic Motivation (1990–2010)
  • J. Schmidhuber
  • Computer Science
  • IEEE Transactions on Autonomous Mental Development
  • 2010
This overview first describes theoretically optimal (but not necessarily practical) ways of implementing the basic computational principles on exploratory, intrinsically motivated agents or robots, encouraging them to provoke event sequences exhibiting previously unknown, but learnable, algorithmic regularities.

Self-Delimiting Neural Networks
To apply AOPS to (possibly recurrent) neural networks (NNs) and to efficiently teach a SLIM NN to solve many tasks, each connection keeps a list of tasks it is used for, which may be efficiently updated during training.

Intrinsically Motivated Learning of Real-World Sensorimotor Skills with Developmental Constraints
This chapter argues that exploration in real-world sensorimotor spaces needs to be constrained and guided by several combined developmental mechanisms, in particular: sensorimotor primitives and embodiment, task space representations, maturational processes, and social guidance.

Bias-Optimal Incremental Problem Solving
Given is a problem sequence and a probability distribution (the bias) on programs computing solution candidates. We present an optimally fast way of incrementally solving each task in the sequence.

Artificial curiosity based on discovering novel algorithmic predictability through coevolution
  • J. Schmidhuber
  • Computer Science
  • Proceedings of the 1999 Congress on Evolutionary Computation (CEC99)
  • 1999
A "curious" embedded agent is presented that differs from previous explorers in the sense that it can limit its predictions to fairly arbitrary, computable aspects of event sequences, and thus can explicitly ignore almost arbitrary unpredictable, random aspects.

Optimal Ordered Problem Solver
An efficient, recursive, backtracking-based way of implementing OOPS on realistic computers with limited storage is introduced, and experiments illustrate how OOPS can greatly profit from metalearning or metasearching, that is, searching for faster search procedures.