# First Experiments with PowerPlay

@article{Srivastava2013FirstEW,
  title   = {First Experiments with PowerPlay},
  author  = {Rupesh Kumar Srivastava and Bas R. Steunebrink and J{\"u}rgen Schmidhuber},
  journal = {Neural Networks},
  year    = {2013},
  volume  = {41},
  pages   = {130--136}
}

Like a scientist or a playing child, POWERPLAY (Schmidhuber, 2011) not only learns new skills to solve given problems, but also invents new interesting problems by itself. By design, it continually comes up with the fastest to find, initially novel, but eventually solvable tasks. It also continually simplifies or compresses or speeds up solutions to previous tasks. Here we describe first experiments with POWERPLAY. A self-delimiting recurrent neural network SLIM RNN (Schmidhuber, 2012) is used…
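The loop the abstract describes — search for the simplest task the current solver cannot yet solve, then modify the solver so it solves the new task without forgetting old ones — can be sketched as below. This is a toy stand-in, not the paper's SLIM RNN implementation: the candidate tasks, the lookup-table "solver", and the squaring "solution" are all hypothetical placeholders.

```python
# Hedged sketch of the PowerPlay loop (Schmidhuber, 2011): repeatedly invent
# the simplest still-unsolvable task, then augment the solver so it solves the
# new task while still solving every previously learned task.
# candidate_tasks is assumed sorted simplest-first; a real system would search
# a space of (task, solver-modification) pairs instead of a fixed list.

def powerplay(candidate_tasks, steps):
    solver = {}          # toy solver: a lookup table of task -> solution
    repertoire = []      # tasks solved so far, in order of invention
    for _ in range(steps):
        # 1. Task invention: scan candidates from simplest to hardest and
        #    pick the first one the current solver cannot yet solve.
        task = next((t for t in candidate_tasks if t not in solver), None)
        if task is None:
            break        # no still-unsolvable candidate left
        # 2. Solver modification: "learn" the new task
        #    (squaring is a stand-in for real problem solving).
        new_solver = dict(solver)
        new_solver[task] = task * task
        # 3. Correctness demonstration: the modified solver must still solve
        #    every task in the repertoire (no forgetting) plus the new one.
        if all(t in new_solver for t in repertoire + [task]):
            solver = new_solver
            repertoire.append(task)
    return repertoire, solver
```

Because step 1 always picks the simplest unsolved candidate, the repertoire grows in order of increasing difficulty, mirroring the "fastest to find, initially novel, but eventually solvable" ordering of tasks.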

#### 47 Citations

PowerPlay: Training an Increasingly General Problem Solver by Continually Searching for the Simplest Still Unsolvable Problem

- Computer Science, Medicine
- Front. Psychol.
- 2013

This work focuses on automatically inventing or discovering problems in a way inspired by the playful behavior of animals and humans, to train a more and more general problem solver from scratch in an unsupervised fashion.

One Big Net For Everything

- Computer Science
- ArXiv
- 2018

The incremental training of an increasingly general problem solver, continually learning to solve new tasks without forgetting previous skills is applied, to greatly speed up subsequent learning of additional, novel but algorithmically related skills.

Competitive Experience Replay

- 2019

Deep learning has achieved remarkable successes in solving challenging reinforcement learning (RL) problems. However, it still often suffers from the need to engineer a reward function that not only…

Meta-learning curiosity algorithms

- Computer Science, Mathematics
- ICLR
- 2020

This work proposes a strategy for encoding curiosity algorithms as programs in a domain-specific language and searching, during a meta-learning phase, for algorithms that enable RL agents to perform well in new domains.

Hindsight Experience Replay

- Computer Science, Mathematics
- NIPS
- 2017

A novel technique is presented which allows sample-efficient learning from rewards which are sparse and binary and therefore avoid the need for complicated reward engineering and may be seen as a form of implicit curriculum.

Effective, interpretable algorithms for curiosity automatically discovered by evolutionary search

- 2020

We take the hypothesis that curiosity is a mechanism found by evolution that encourages meaningful exploration early in an agent’s life in order to expose it to experiences that enable it to obtain…

Intrinsically Motivated Goal Exploration Processes with Automatic Curriculum Learning

- Computer Science
- ArXiv
- 2017

The computational efficiency of IMGEPs is illustrated: these robotic experiments use simple memory-based low-level policy representations and search algorithms, enabling the whole system to learn online and incrementally on a Raspberry Pi 3.

A Curious Robot Learner for Interactive Goal-Babbling: Strategically Choosing What, How, When and from Whom to Learn

- Psychology, Philosophy
- 2013

An intrinsically motivated active learner which learns how its actions can produce varied consequences or outcomes is built, which actively learns online by sampling data which it chooses by using several sampling modes.

Optimal Curiosity-Driven Modular Incremental Slow Feature Analysis

- Computer Science, Medicine
- Neural Computation
- 2016

This work theoretically shows that, using a model called curiosity-driven modular incremental slow feature analysis, the agent on average will learn slow feature representations in order of increasing learning difficulty, under certain mild conditions.

Competitive Experience Replay

- Computer Science, Mathematics
- ICLR
- 2019

This work proposes a novel method called competitive experience replay, which efficiently supplements a sparse reward by placing learning in the context of an exploration competition between a pair of agents, creating a competitive game designed to drive exploration.

#### References

Showing 1–10 of 35 references

Continually adding self-invented problems to the repertoire: First experiments with POWERPLAY

- Computer Science
- 2012 IEEE International Conference on Development and Learning and Epigenetic Robotics (ICDL)
- 2012

A self-delimiting recurrent neural network (SLIM RNN) is used as a general computational architecture to implement the system's solver. The system learns to become an increasingly general problem solver, continually adding new problem-solving procedures to its growing repertoire and exhibiting interesting developmental stages.

PowerPlay: Training an Increasingly General Problem Solver by Continually Searching for the Simplest Still Unsolvable Problem

- Computer Science, Medicine
- Front. Psychol.
- 2013

This work focuses on automatically inventing or discovering problems in a way inspired by the playful behavior of animals and humans, to train a more and more general problem solver from scratch in an unsupervised fashion.

Developmental robotics, optimal artificial curiosity, creativity, music, and the fine arts

- Computer Science
- Connect. Sci.
- 2006

It is pointed out how the fine arts can be formally understood as a consequence of the basic principle: given some subjective observer, great works of art and music yield observation histories exhibiting more novel, previously unknown compressibility/regularity/predictability than lesser works, thus deepening the observer’s understanding of the world and what is possible in it.

Exploring the predictable

- Computer Science
- 2003

This work studies an embedded active learner that can limit its predictions to almost arbitrary computable aspects of spatio-temporal events and constructs probabilistic algorithms that map event sequences to abstract internal representations (IRs), and predicts IRs from IRs computed earlier.

Formal Theory of Creativity, Fun, and Intrinsic Motivation (1990–2010)

- Computer Science
- IEEE Transactions on Autonomous Mental Development
- 2010

This overview first describes theoretically optimal (but not necessarily practical) ways of implementing the basic computational principles on exploratory, intrinsically motivated agents or robots, encouraging them to provoke event sequences exhibiting previously unknown, but learnable algorithmic regularities.

Self-Delimiting Neural Networks

- Computer Science
- ArXiv
- 2012

To apply AOPS to (possibly recurrent) neural networks (NNs) and to efficiently teach a SLIM NN to solve many tasks, each connection keeps a list of tasks it is used for, which may be efficiently updated during training.

Intrinsically Motivated Learning of Real-World Sensorimotor Skills with Developmental Constraints

- Psychology, Computer Science
- Intrinsically Motivated Learning in Natural and Artificial Systems
- 2013

This chapter argues that exploration in real-world sensorimotor spaces needs to be constrained and guided by several combined developmental mechanisms, in particular: sensorimotor primitives and embodiment, task space representations, maturational processes, and social guidance.

Bias-Optimal Incremental Problem Solving

- Computer Science
- NIPS
- 2002

Given is a problem sequence and a probability distribution (the bias) on programs computing solution candidates. We present an optimally fast way of incrementally solving each task in the sequence.…

Artificial curiosity based on discovering novel algorithmic predictability through coevolution

- Computer Science
- Proceedings of the 1999 Congress on Evolutionary Computation-CEC99 (Cat. No. 99TH8406)
- 1999

A "curious" embedded agent that differs from previous explorers in the sense that it can limit its predictions to fairly arbitrary, computable aspects of event sequences and thus can explicitly ignore almost arbitrary unpredictable, random aspects.

Optimal Ordered Problem Solver

- Computer Science
- Machine Learning
- 2004

An efficient, recursive, backtracking-based way of implementing OOPS on realistic computers with limited storage is introduced, and experiments illustrate how OOPS can greatly profit from metalearning or metasearching, that is, searching for faster search procedures.