Scaling life-long off-policy learning

@inproceedings{White2012ScalingLO,
  title={Scaling life-long off-policy learning},
  author={Adam White and Joseph Modayil and R. Sutton},
  booktitle={2012 IEEE International Conference on Development and Learning and Epigenetic Robotics (ICDL)},
  year={2012},
  pages={1--6}
}
  • Adam White, Joseph Modayil, R. Sutton
  • Published 2012
  • Computer Science
  • 2012 IEEE International Conference on Development and Learning and Epigenetic Robotics (ICDL)
  • In this paper we pursue an approach to scaling life-long learning using parallel off-policy reinforcement learning algorithms. In life-long learning a robot continually learns from a life-time of experience, slowly acquiring and applying skills and knowledge to new situations. Many of the benefits of life-long learning are a result of scaling the amount of training data processed by the robot to long sensorimotor streams. Another dimension of scaling can be added by allowing off-policy…

    Multi-timescale nexting in a reinforcement learning robot
    87
    The Online Coupon-Collector Problem and Its Application to Lifelong Reinforcement Learning
    Experience Replay Using Transition Sequences
    3
    Adapting Behaviour via Intrinsic Reward: A Survey and Empirical Study
    8
    Prediction Driven Behavior: Learning Predictions that Drive Fixed Responses
    9

    References

    Publications referenced by this paper.
    SHOWING 1-10 OF 30 REFERENCES
    The Fixed Points of Off-Policy TD
    29
    Multi-timescale nexting in a reinforcement learning robot
    87
    Policy search for motor primitives in robotics
    288
    Off-policy Learning with Recognizers
    9
    Gradient temporal-difference learning algorithms
    112
    Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning
    2106
    Horde: a scalable real-time architecture for learning knowledge from unsupervised sensorimotor interaction
    290
    Lifelong robot learning
    275
    Robot Learning From Demonstration
    633
    Reinforcement Learning: An Introduction
    24953