• Publications
Revisiting the Arcade Learning Environment: Evaluation Protocols and Open Problems for General Agents
TLDR
This paper takes a big-picture look at how the ALE is being used by the research community, focusing on how diverse evaluation methodologies in the ALE have become and highlighting key concerns that arise when evaluating agents on this platform.
A Laplacian Framework for Option Discovery in Reinforcement Learning
TLDR
This paper addresses the option discovery problem by showing how proto-value functions (PVFs) implicitly define options: it introduces eigenpurposes, intrinsic reward functions derived from the learned representations, which traverse the principal directions of the state space.
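As a rough illustration of the idea, the sketch below computes eigenvectors of a graph Laplacian (the PVFs) and uses one of them to define an eigenpurpose reward. It assumes a small tabular domain with a known adjacency matrix; the names are illustrative, not the authors' code.

  import numpy as np

  def laplacian_eigenvectors(adjacency):
      # Combinatorial graph Laplacian L = D - A of the state-transition graph.
      degree = np.diag(adjacency.sum(axis=1))
      laplacian = degree - adjacency
      # Eigenvectors with the smallest eigenvalues vary most smoothly over the
      # graph; these are the proto-value functions (PVFs).
      eigenvalues, eigenvectors = np.linalg.eigh(laplacian)
      return eigenvectors  # column i is the i-th PVF

  def eigenpurpose_reward(eigenvector, phi_s, phi_s_next):
      # Intrinsic reward for progress along one principal direction:
      # r(s, s') = e^T (phi(s') - phi(s)).
      return float(eigenvector @ (phi_s_next - phi_s))

In the tabular case phi(s) is the one-hot indicator of s, so the reward reduces to the difference between the eigenvector's entries at s' and s; an eigenoption is then a policy that maximizes this intrinsic reward.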
Count-Based Exploration with the Successor Representation
TLDR
A simple approach to exploration in reinforcement learning (RL) that yields theoretically justified algorithms in the tabular case, extends to settings that require function approximation, and achieves state-of-the-art performance on Atari 2600 games in the low sample-complexity regime.
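A minimal tabular sketch of the idea, assuming the exploration bonus is taken to be inversely related to a norm of the current state's successor representation (whose magnitude implicitly tracks visitation). The specific bonus form and the constant beta below are assumptions for illustration, not necessarily the exact form used in the paper.

  import numpy as np

  def sr_td_update(psi, s, s_next, gamma=0.99, alpha=0.1):
      # TD(0) update of the tabular successor representation:
      # psi[s] <- psi[s] + alpha * (1_s + gamma * psi[s'] - psi[s]).
      indicator = np.zeros(psi.shape[0])
      indicator[s] = 1.0
      psi[s] += alpha * (indicator + gamma * psi[s_next] - psi[s])
      return psi

  def exploration_bonus(psi, s, beta=0.05):
      # Less-visited states have a smaller SR norm while the SR is still being
      # learned, so they receive a larger bonus (an assumed, count-like form).
      return beta / np.sqrt(np.linalg.norm(psi[s], ord=1) + 1e-8)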
Eigenoption Discovery through the Deep Successor Representation
TLDR
This paper proposes an algorithm that discovers eigenoptions while learning non-linear state representations from raw pixels, exploiting recent successes in the deep reinforcement learning literature and the equivalence between proto-value functions and the successor representation.
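The equivalence mentioned above can be seen in the tabular case: the successor representation of a fixed policy is (I - gamma*T)^{-1}, and decomposing it yields the same principal directions that PVFs capture. The sketch below is an illustrative tabular rendering of that connection, not the deep-network algorithm from the paper.

  import numpy as np

  def successor_representation(transition, gamma=0.99):
      # Closed-form tabular SR for a fixed policy with transition matrix T:
      # Psi = (I - gamma * T)^{-1}.
      n = transition.shape[0]
      return np.linalg.inv(np.eye(n) - gamma * transition)

  def eigenoption_directions(sr_matrix):
      # The singular directions of the SR play the role of PVFs; each one can
      # be plugged into an eigenpurpose reward as in the sketch above.
      _, _, vt = np.linalg.svd(sr_matrix)
      return vt  # row i is one direction over states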
State of the Art Control of Atari Games Using Shallow Reinforcement Learning
TLDR
This paper systematically evaluates the importance of key representational biases encoded by DQN's network by proposing simple linear representations that make use of these concepts, and obtains a computationally practical feature set that achieves performance competitive with DQN in the ALE.
Contrastive Behavioral Similarity Embeddings for Generalization in Reinforcement Learning
TLDR
A theoretically motivated policy similarity metric (PSM) for measuring behavioral similarity between states is introduced, and the resulting policy similarity embeddings (PSEs) are shown to improve generalization on diverse benchmarks, including LQR with spurious correlations, a jumping task from pixels, and the Distracting DM Control Suite.
Generalization and Regularization in DQN
TLDR
Despite regularization being largely underutilized in deep RL, it is shown that it can, in fact, help DQN learn more general features that can then be reused and fine-tuned on similar tasks, considerably improving the sample efficiency of DQN.
Revisiting the Arcade Learning Environment: Evaluation Protocols and Open Problems for General Agents
TLDR
This paper takes a big-picture look at how the Arcade Learning Environment is being used by the research community, revisiting the challenges posed when the ALE was introduced, summarizing the state of the art in various problems, and highlighting problems that remain open.
True Online Temporal-Difference Learning
TLDR
This article compares the performance of true online TD($\lambda$)/Sarsa($\lambda$) with regular TD($\lambda$)/Sarsa($\lambda$) on random MRPs, a real-world myoelectric prosthetic arm, and a domain from the Arcade Learning Environment, and suggests that the true online methods indeed dominate the regular methods.
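For readers unfamiliar with the method, here is a minimal sketch of one step of true online TD(lambda) with linear function approximation and dutch traces, following the standard formulation; variable names are illustrative.

  import numpy as np

  def true_online_td_step(w, e, v_old, x, x_next, reward,
                          alpha=0.01, gamma=0.99, lam=0.9):
      # Linear value estimates: V(s) = w . x(s).
      v = w @ x
      v_next = w @ x_next
      delta = reward + gamma * v_next - v
      # Dutch eligibility trace, the ingredient that distinguishes the true
      # online update from conventional accumulating traces.
      e = gamma * lam * e + x - alpha * gamma * lam * (e @ x) * x
      # Weight update with the extra correction term involving v_old.
      w = w + alpha * (delta + v - v_old) * e - alpha * (v - v_old) * x
      return w, e, v_next  # v_next becomes v_old at the next step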
On Bonus Based Exploration Methods In The Arcade Learning Environment
TLDR
The results suggest that recent gains in Montezuma's Revenge may be better attributed to architectural changes rather than to better exploration schemes, and that the real pace of progress in exploration research for Atari 2600 games may have been obfuscated by good results on a single domain.
...
...