Corpus ID: 231942312

Fully General Online Imitation Learning

@article{Cohen2021FullyGO,
  title={Fully General Online Imitation Learning},
  author={Michael K. Cohen and Marcus Hutter and Neel Nanda},
  journal={ArXiv},
  year={2021},
  volume={abs/2102.08686}
}
In imitation learning, imitators and demonstrators are policies for picking actions given past interactions with the environment. If we run an imitator, we probably want events to unfold similarly to the way they would have if the demonstrator had been acting the whole time. No existing work provides formal guidance in how this might be accomplished, instead restricting focus to environments that restart, making learning unusually easy, and conveniently limiting the significance of any mistake…
