TAMER: Training an Agent Manually via Evaluative Reinforcement

@article{Knox2008TAMERTA,
  title={TAMER: Training an Agent Manually via Evaluative Reinforcement},
  author={W. Bradley Knox and P. Stone},
  journal={2008 7th IEEE International Conference on Development and Learning},
  year={2008},
  pages={292-297}
}
  • W. B. Knox, P. Stone
  • Published 2008
  • Computer Science
  • 2008 7th IEEE International Conference on Development and Learning
Though computers have surpassed humans at many tasks, especially computationally intensive ones, there are many tasks for which human expertise remains necessary and/or useful. For such tasks, it is desirable for a human to be able to transmit knowledge to a learning agent as quickly and effortlessly as possible, and, ideally, without any knowledge of the details of the agent's learning process. This paper proposes a general framework called Training an Agent Manually via Evaluative Reinforcement (TAMER) …
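At a high level, a TAMER agent treats the trainer's scalar feedback as a label for the recently taken action, fits a model of that feedback, and then acts greedily with respect to the model's predictions rather than with respect to any environmental reward. The sketch below is only an illustration of that loop, assuming a linear model over hand-coded features; the paper's handling of feedback delay and credit assignment is omitted, and the names here (TamerAgent, give_feedback, etc.) are placeholders, not the authors' code.

    import numpy as np

    class TamerAgent:
        """Illustrative TAMER-style learner (a sketch, not the paper's code):
        fit a model H(s, a) of the human trainer's scalar feedback and pick
        the action with the highest predicted feedback."""

        def __init__(self, n_features, n_actions, lr=0.05):
            self.w = np.zeros((n_actions, n_features))  # one linear model per action
            self.lr = lr

        def act(self, features):
            # Myopic, greedy choice: only the predicted human reinforcement
            # for the next action matters, with no discounted return.
            return int(np.argmax(self.w @ features))

        def give_feedback(self, features, action, human_signal):
            # Supervised (delta-rule) update toward the trainer's signal.
            error = human_signal - self.w[action] @ features
            self.w[action] += self.lr * error * features

In use, the surrounding loop would call act(), execute the action, and call give_feedback() only when the trainer actually delivers a reward or punishment signal; steps without feedback leave the model unchanged.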
Combining manual feedback with subsequent MDP reward signals for reinforcement learning
TLDR
The fast learning exhibited within the TAMER framework is leveraged to hasten a reinforcement learning (RL) algorithm's climb up the learning curve, effectively demonstrating that human reinforcement and MDP reward can be used in conjunction with one another by an autonomous agent.
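The combination itself can be realized in more than one way; a simple illustrative variant is to shape the environment reward with the learned human-reinforcement model and to anneal its influence as autonomous learning takes over. The snippet below sketches that variant under assumed names (h_pred for the TAMER model's prediction, a tabular Q array); it is not claimed to be the specific combination scheme reported in the cited paper.

    import numpy as np

    def q_update(Q, s, a, r_env, h_pred, s_next,
                 alpha=0.1, gamma=0.99, weight=1.0):
        # Tabular Q-learning step on a shaped reward: the MDP reward is
        # augmented by the predicted human reinforcement for (s, a).
        # Decaying `weight` over training lets the human signal guide early
        # exploration while the environment reward dominates later.
        shaped = r_env + weight * h_pred
        Q[s, a] += alpha * (shaped + gamma * Q[s_next].max() - Q[s, a])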
DQN-TAMER: Human-in-the-Loop Reinforcement Learning with Intractable Feedback
TLDR
This work demonstrates a real-world human-in-the-loop RL application in which a camera automatically recognizes a user's facial expressions as feedback to the agent while the agent explores a maze, and proposes an RL method called DQN-TAMER, which efficiently uses both human feedback and distant rewards.
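One plausible reading of "uses both human feedback and distant rewards" is an agent that keeps two estimators, a Q-network trained on environment reward and a TAMER-style model of immediate human feedback, and blends their per-action scores when choosing what to do. The sketch below shows such a blend with a decaying weight on the human model; it illustrates the general idea only and is not claimed to be DQN-TAMER's exact combination rule.

    import numpy as np

    def blended_action(q_values, h_values, human_weight):
        # q_values: per-action estimates from the environment-reward network.
        # h_values: per-action predictions of immediate human feedback
        # (e.g. decoded from facial expressions, as in the cited work).
        # human_weight would typically be annealed toward 0 as Q matures.
        return int(np.argmax(q_values + human_weight * h_values))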
Training a Tetris agent via interactive shaping: a demonstration of the TAMER framework
TLDR
This work treats shaping as a specific mode of knowledge transfer, distinct from (and probably complementary to) other natural methods of communication, including programming by demonstration and advice-giving; its goal is to create agents that can be shaped effectively.
Deep Reinforcement Learning from Policy-Dependent Human Feedback
TLDR
The effectiveness of the Deep COACH algorithm is demonstrated in the rich 3D world of Minecraft with an agent that learns to complete tasks by mapping from raw pixels to actions using only real-time human feedback in 10-15 minutes of interaction.
Learning from feedback on actions past and intended
Robotic learning promises to eventually provide great societal benefits. In contrast to pure trial-and-error learning, human instruction has at least two benefits: (1) Human teaching can lead to much …
Mutual Reinforcement Learning
TLDR
A shared cognitive model is obtained which not only improves human cognition but also enhances the robot's cognitive strategy to understand the mental model of its human partners while building a successful robot-human collaborative framework.
Design Principles for Creating Human-Shapable Agents
TLDR
A framework that allows a human to train a learning agent by giving simple scalar reinforcement signals while observing the agent perform the task is described, and a set of conjectures about aspects of human teaching behavior that could be incorporated into future work on HT agents is proposed.
Learning via human feedback in continuous state and action spaces
TLDR
An extension of TAMER that allows both continuous states and actions, called ACTAMER, is proposed; it can utilize any general function approximation of a human trainer's feedback signal.
Reinforcement learning combined with human feedback in continuous state and action spaces
  • Ngo Anh Vien, W. Ertel
  • Computer Science
  • 2012 IEEE International Conference on Development and Learning and Epigenetic Robotics (ICDL)
  • 2012
TLDR
The new framework extends the original TAMER to allow the use of any general function approximation of a human trainer's reinforcement signal, and investigates combining ACTAMER with reinforcement learning (RL).
Multi-Channel Interactive Reinforcement Learning for Sequential Tasks
TLDR
The experimental evaluations show that the approach can successfully incorporate human input, even when it is partially wrong, to accelerate the learning process in both robotic tasks, and can be beneficial for the future design of algorithms and interfaces of interactive reinforcement learning systems used by inexperienced users.

References

Showing 1-10 of 20 references
Apprenticeship learning via inverse reinforcement learning
TLDR
This work thinks of the expert as trying to maximize a reward function that is expressible as a linear combination of known features, and gives an algorithm for learning the task demonstrated by the expert, based on using "inverse reinforcement learning" to try to recover the unknown reward function.
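The linear-reward assumption is what makes the approach tractable: if R(s) = w · φ(s), then a policy whose discounted feature expectations match the expert's obtains the same expected return for the true (unknown) w, so matching feature expectations suffices. Below is a small sketch of estimating those expectations from demonstrations; the trajectory format and the feature map phi are assumptions of this sketch, and the paper's max-margin/projection step for choosing w is not shown.

    import numpy as np

    def feature_expectations(trajectories, phi, gamma=0.99):
        # Empirical discounted feature expectations
        #   mu = E[ sum_t gamma^t * phi(s_t) ],
        # averaged over the expert's demonstration trajectories.
        mu = None
        for traj in trajectories:
            disc = sum((gamma ** t) * np.asarray(phi(s), dtype=float)
                       for t, s in enumerate(traj))
            mu = disc if mu is None else mu + disc
        return mu / len(trajectories)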
Reinforcement Learning with Human Teachers: Evidence of Feedback and Guidance with Implications for Learning Performance
TLDR
The importance of understanding the human-teacher/robot-learner system as a whole in order to design algorithms that support how people want to teach while simultaneously improving the robot's learning performance is demonstrated.
Creating Advice-Taking Reinforcement Learners
TLDR
This work presents and evaluates a design that addresses this shortcoming by allowing a connectionist Q-learner to accept advice given, at any time and in a natural manner, by an external observer, and shows that, given good advice, a learner can achieve statistically significant gains in expected reward.
Reinforcement Learning: An Introduction
TLDR
This book provides a clear and simple account of the key ideas and algorithms of reinforcement learning, ranging from the history of the field's intellectual foundations to the most recent developments and applications.
Integrated learning for interactive synthetic characters
TLDR
An autonomous animated dog is built that can be trained with a technique used to train real dogs called "clicker training"; capabilities demonstrated include being trained to recognize and use acoustic patterns as cues for actions, as well as to synthesize new actions from novel paths through its motion space.
Cobot in LambdaMOO: An Adaptive Social Statistics Agent
TLDR
Cobot, a novel software agent who lives in LambdaMOO, a popular virtual world frequented by hundreds of users, uses reinforcement learning to proactively take action in this complex social environment, and adapts his behavior based on multiple sources of human reward.
TD-Gammon, a Self-Teaching Backgammon Program, Achieves Master-Level Play
TLDR
The latest version of TD-Gammon is now estimated to play at a strong master level that is extremely close to the world's best human players.
A simulation-theory inspired social learning system for interactive characters
TLDR
Max T. Mouse, an anthropomorphic animated mouse character who uses his own motor and action representations to interpret the behaviors he sees his friend Morris Mouse performing (a process known as simulation theory in the cognitive literature), is presented.
Reinforcement Learning for RoboCup Soccer Keepaway
TLDR
The application of episodic SMDP Sarsa(λ) with linear tile-coding function approximation and variable λ to learning higher-level decisions in a keepaway subtask of RoboCup soccer results in agents that significantly outperform a range of benchmark policies.
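For readers who have not seen the learner named here, a minimal sketch of a single linear Sarsa(λ) update with binary tile-coding features follows (replacing traces assumed); tile widths, the SMDP treatment of durative keepaway actions, and exploration are all omitted, so this is an illustration of the algorithm family rather than the paper's implementation.

    import numpy as np

    def sarsa_lambda_step(w, z, tiles, a, reward, next_tiles, next_a,
                          alpha=0.1, gamma=1.0, lam=0.9):
        # w, z: (n_actions, n_features) weight and eligibility-trace arrays.
        # tiles / next_tiles: indices of the binary features active in the
        # current / next state under the tile coding.
        q = w[a, tiles].sum()
        q_next = w[next_a, next_tiles].sum()
        delta = reward + gamma * q_next - q
        z *= gamma * lam                # decay all traces
        z[a, tiles] = 1.0               # replacing traces on active features
        w += alpha * delta * z
        return w, z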
Using Prior Knowledge to Improve Reinforcement Learning in Mobile Robotics
Reinforcement learning (RL) is thought to be an appropriate paradigm for acquiring control policies in mobile robotics. However, in its standard formulation (tabula rasa) RL must explore and learn …