Diverse Agents for Ad-Hoc Cooperation in Hanabi

@article{Canaan2019DiverseAF,
  title={Diverse Agents for Ad-Hoc Cooperation in Hanabi},
  author={Rodrigo Canaan and Julian Togelius and Andy Nealen and Stefan Menzel},
  journal={2019 IEEE Conference on Games (CoG)},
  year={2019},
  pages={1-8}
}
In complex scenarios where a model of other actors is necessary to predict and interpret their actions, it is often desirable that the model works well with a wide variety of previously unknown actors. Hanabi is a card game that brings the problem of modeling other players to the forefront, but there is no agreement on how to best generate a pool of agents to use as partners in ad-hoc cooperation evaluation. This paper proposes Quality Diversity algorithms as a promising class of algorithms to…

Citations

Generating and Adapting to Diverse Ad-Hoc Cooperation Agents in Hanabi
TLDR: Quality Diversity algorithms are proposed as a promising class of algorithms to generate diverse populations for this purpose, and a population of diverse Hanabi agents is generated using MAP-Elites.
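To make the MAP-Elites reference above concrete, here is a minimal, generic sketch of the algorithm in Python. It is not the authors' implementation: the fitness function, behavioral descriptor, and grid resolution are hypothetical placeholders, whereas in the Hanabi setting a genome would parameterize an agent and fitness would come from simulated game scores.

import random

# Minimal MAP-Elites sketch (illustrative only, not the paper's implementation).
# Each "genome" is a small parameter vector; in the Hanabi setting it would
# parameterize an agent, and fitness would come from simulated game scores.

GRID = 10         # resolution of the behavior-space grid (hypothetical)
GENOME_LEN = 6    # length of the parameter vector (hypothetical)

def evaluate(genome):
    """Return (fitness, 2-D behavior descriptor); both are toy placeholders."""
    fitness = -sum((g - 0.5) ** 2 for g in genome)
    behavior = (genome[0], genome[1])          # descriptor in [0, 1]^2
    return fitness, behavior

def to_cell(behavior):
    """Map a descriptor to a discrete grid cell."""
    return tuple(min(GRID - 1, int(b * GRID)) for b in behavior)

def mutate(genome, sigma=0.1):
    """Gaussian mutation, clipped to [0, 1]."""
    return [min(1.0, max(0.0, g + random.gauss(0.0, sigma))) for g in genome]

def map_elites(iterations=5000, seed_size=50):
    archive = {}  # cell -> (fitness, genome); one elite per behavioral niche
    for i in range(iterations):
        if i < seed_size or not archive:
            genome = [random.random() for _ in range(GENOME_LEN)]
        else:
            _, parent = random.choice(list(archive.values()))
            genome = mutate(parent)
        fitness, behavior = evaluate(genome)
        cell = to_cell(behavior)
        if cell not in archive or fitness > archive[cell][0]:
            archive[cell] = (fitness, genome)  # keep only the best per cell
    return archive

if __name__ == "__main__":
    elites = map_elites()
    print(f"filled {len(elites)} of {GRID * GRID} cells")

The resulting archive of per-niche elites is the kind of diverse agent pool the paper proposes to use as partners for ad-hoc cooperation evaluation.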
Evaluating the Rainbow DQN Agent in Hanabi with Unseen Partners
TLDR: It is shown that agents trained through self-play using the popular Rainbow DQN architecture fail to cooperate well with simple rule-based agents that were not seen during training and, conversely, that agents trained to play with any individual rule-based agent, or even a mix of these agents, fail to achieve good self-play scores.
Evaluating RL Agents in Hanabi with Unseen Partners
Hanabi is a cooperative game that challenges existing AI techniques due to its focus on modeling the mental states of other players to interpret and predict their behavior. While there are agents…
Learning Robust Helpful Behaviors in Two-Player Cooperative Atari Environments
TLDR: This work begins the study of helpful behavior in the setting of two-player Atari games, suitably modified to provide cooperative incentives, in order to understand whether reinforcement learning can be used to achieve robust, helpful behavior.
Improving Policies via Search in Cooperative Partially Observable Games
TLDR: This paper proposes two different search techniques that can be applied to improve an arbitrary agreed-upon policy in a cooperative partially observable game, and proves that these search procedures are theoretically guaranteed to at least maintain the original performance of the agreed-upon policy (up to a bounded approximation error).
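As a rough, hedged illustration of the idea summarized above (not the paper's actual procedures), the sketch below wraps a fixed "blueprint" policy with one-step Monte Carlo search: hidden states are sampled from the searching player's beliefs, each legal action is scored by rollouts, and the agent deviates from the agreed-upon action only when the estimated gain is clear. Every interface here (belief_sampler, rollout, blueprint_action) is a hypothetical placeholder.

import random
from statistics import mean

# Hedged sketch of one-step Monte Carlo search on top of a fixed "blueprint"
# policy, in the spirit of the summary above but not the paper's code.

def search_action(belief_sampler, legal_actions, rollout, blueprint_action,
                  n_samples=200, margin=0.0):
    """Estimate each action's value by rollouts over sampled hidden states and
    deviate from the agreed-upon action only when the estimated gain is clear."""
    values = {a: mean(rollout(belief_sampler(), a) for _ in range(n_samples))
              for a in legal_actions}
    best = max(values, key=values.get)
    # Falling back to the blueprint unless search clearly improves on it mirrors
    # the "at least maintain the original performance" guarantee cited above.
    return best if values[best] > values[blueprint_action] + margin else blueprint_action

# Toy usage: a hidden coin, reward 1 for guessing it, and a blueprint that always plays 0.
if __name__ == "__main__":
    hidden = 1
    belief_sampler = lambda: hidden if random.random() < 0.8 else 1 - hidden  # noisy belief
    rollout = lambda state, action: 1.0 if action == state else 0.0
    print("search chose action:", search_action(belief_sampler, [0, 1], rollout, blueprint_action=0))

In a real cooperative game the rollout would play out to the end of the game with all players following the blueprint, which is what makes a guarantee of the quoted form plausible.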
Behavioral Evaluation of Hanabi Rainbow DQN Agents and Rule-Based Agents
TLDR: A key finding is that while most agents only learn to play well with partners seen during training, one particular agent leads the Rainbow algorithm towards a much more general policy.
Investigating Partner Diversification Methods in Cooperative Multi-agent Deep Reinforcement Learning
TLDR: It is revealed that widely used methods such as partner sampling and population-based training are unreliable at introducing diversity in fully cooperative multi-agent Markov decision processes, and it is found that generating pre-trained partners is a simple yet effective procedure for achieving diversity.
HOAD: The Hanabi Open Agent Dataset
TLDR: This work describes in detail an easy way to add new agents to HOAD regardless of the agent's origin codebase, and makes the code and dataset publicly available at https://github.com/aronsar/hoad.
Learning with Generated Teammates to Achieve Type-Free Ad-Hoc Teamwork
In ad-hoc teamwork, an agent is required to cooperate with unknown teammates without prior coordination. To swiftly adapt to an unknown teammate, most works adopt a type-based approach, which…
Simplified Action Decoder for Deep Multi-Agent Reinforcement Learning
TLDR: A new deep multi-agent RL method, the Simplified Action Decoder (SAD), resolves this contradiction by exploiting the centralized training phase and establishes a new state of the art among learning methods for 2-5 players on the self-play part of the Hanabi challenge.
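A toy sketch of the idea summarized above, under the assumption that the key trick is to let partners observe the acting player's greedy action during centralized training even when an exploratory action is actually executed. The Q-values, observations, and encoding below are placeholders, not the paper's architecture.

import random

# Toy illustration only; Q-values and observations are placeholders.

def select_actions(q_values, epsilon):
    """Return (executed_action, greedy_action) for an epsilon-greedy actor."""
    greedy = max(range(len(q_values)), key=q_values.__getitem__)
    executed = random.randrange(len(q_values)) if random.random() < epsilon else greedy
    return executed, greedy

def partner_input(partner_obs, greedy_action, num_actions):
    """Append a one-hot encoding of the greedy action to the partner's observation
    (extra information that is only available during centralized training)."""
    one_hot = [0.0] * num_actions
    one_hot[greedy_action] = 1.0
    return partner_obs + one_hot

if __name__ == "__main__":
    q = [0.1, 0.7, 0.2]
    executed, greedy = select_actions(q, epsilon=0.3)
    print("executed:", executed, "greedy:", greedy,
          "partner obs:", partner_input([0.0, 1.0], greedy, num_actions=len(q)))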

References

Showing 1-10 of 27 references.
Evolving Agents for the Hanabi 2018 CIG Competition
TLDR: A genetic algorithm is developed that builds rule-based agents by determining the best sequence of rules from a fixed rule set to use as a strategy, and it achieves scores superior to previously published research for the mirror and mixed evaluation of agents.
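The "best sequence of rules from a fixed rule set" formulation lends itself to a compact illustration. The sketch below is a toy, not the authors' rule set or evaluation: a chromosome is an ordering of rule indices, the agent fires the first rule whose condition holds, and a simple mutation-only genetic algorithm searches over orderings. The rules and the fitness function are hypothetical stand-ins for rules evaluated on simulated Hanabi games.

import random

# Toy rule set: (name, condition) pairs over a simplified game-state summary.
RULES = [
    ("play_safe",       lambda s: s["playable_known"]),
    ("give_hint",       lambda s: s["hints"] > 0),
    ("discard_useless", lambda s: s["useless_known"]),
    ("discard_oldest",  lambda s: True),
]

def act(chromosome, state):
    """Fire the first rule in the chromosome's order whose condition holds."""
    for idx in chromosome:
        name, condition = RULES[idx]
        if condition(state):
            return name
    return "discard_oldest"

def fitness(chromosome, n_games=50):
    """Placeholder evaluation; in the paper this would be average Hanabi score."""
    score = 0
    for _ in range(n_games):
        state = {"playable_known": random.random() < 0.3,
                 "hints": random.randint(0, 8),
                 "useless_known": random.random() < 0.2}
        action = act(chromosome, state)
        score += {"play_safe": 3, "give_hint": 1,
                  "discard_useless": 1, "discard_oldest": 0}[action]
    return score

def evolve(pop_size=30, generations=40):
    """Mutation-only GA over rule orderings (swap two positions per child)."""
    pop = [random.sample(range(len(RULES)), len(RULES)) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        survivors = pop[: pop_size // 2]
        children = []
        for parent in survivors:
            child = parent[:]
            i, j = random.sample(range(len(child)), 2)
            child[i], child[j] = child[j], child[i]
            children.append(child)
        pop = survivors + children
    return max(pop, key=fitness)

if __name__ == "__main__":
    best = evolve()
    print("best rule order:", [RULES[i][0] for i in best])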
Evaluating and modelling Hanabi-playing agents
TLDR: This paper implements a number of rule-based agents, both from the literature and of the authors' own devising, in addition to an Information Set Monte Carlo Tree Search (IS-MCTS) agent, and constructs a new predictor version that uses a model of the agents with which it is paired.
An intentional AI for hanabi
TLDR: This paper investigates one such game, the award-winning card game Hanabi, and presents an agent, grounded in communication theory and psychology research, that is designed to play better with a human cooperator than previously published agents.
Ad Hoc Autonomous Agent Teams: Collaboration without Pre-Coordination
TLDR: This paper defines the concept of ad hoc team agents, specifies an evaluation paradigm, and provides examples of possible theoretical and empirical approaches to the challenge, in order to encourage progress towards this ambitious, newly realistic, and increasingly important research goal.
How to Make the Perfect Fireworks Display: Two Strategies for Hanabi
Summary: The game of Hanabi is a multiplayer cooperative card game that has many similarities to a mathematical "hat guessing game." In Hanabi, a player does not see the cards in her own hand and must…
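The "hat guessing game" connection can be made concrete with a small modular-arithmetic toy (an illustrative sketch, not the paper's full strategy): a hinter announces the sum, modulo M, of a recommendation it computes for every hand it can see, and each other player recovers its own recommendation by subtracting the recommendations it can compute for everyone else. The alphabet size M and the recommend() heuristic below are hypothetical.

# Toy hat-guessing convention; not the paper's full strategy.

M = 8  # size of the recommendation alphabet (hypothetical, e.g. 8 possible moves)

def recommend(hand):
    """Placeholder: derive a recommendation in range(M) from a visible hand."""
    return sum(hand) % M

def encode_hint(all_hands, hinter):
    """The hinter sees every hand but its own and announces the modular sum."""
    return sum(recommend(h) for p, h in all_hands.items() if p != hinter) % M

def decode_hint(all_hands, hinter, me, announced):
    """Player `me` subtracts the recommendations it can see to recover its own."""
    others = sum(recommend(h) for p, h in all_hands.items() if p not in (hinter, me))
    return (announced - others) % M

if __name__ == "__main__":
    hands = {"A": [1, 4, 2], "B": [3, 3, 0], "C": [2, 5, 1], "D": [0, 2, 4]}
    hint = encode_hint(hands, hinter="A")
    for player in ["B", "C", "D"]:
        decoded = decode_hint(hands, "A", player, hint)
        assert decoded == recommend(hands[player])  # ground-truth check only
        print(player, "recovers recommendation", decoded)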
Autonomous agents modelling other agents: A comprehensive survey and open problems
TLDR: The purpose of the present article is to provide a comprehensive survey of the salient modelling methods that can be found in the literature, and to discuss open problems which may form the basis for fruitful future research.
The Hanabi Challenge: A New Frontier for AI Research
TLDR: It is argued that Hanabi elevates reasoning about the beliefs and intentions of other agents to the foreground, and that developing novel techniques for such theory-of-mind reasoning will be crucial not only for success in Hanabi, but also in broader collaborative efforts, especially those with human partners.
Solving Hanabi: Estimating Hands by Opponent's Actions in Cooperative Game with Incomplete Information
  Hirotaka Osawa. AAAI Workshop: Computer Poker and Imperfect Information, 2015.
TLDR: The results indicate that the strategy that uses feedback from simulated opponents' viewpoints achieves a higher score than the other strategies.
Re-determinizing Information Set Monte Carlo Tree Search in Hanabi
TLDR: Re-determinizing IS-MCTS is introduced, a novel extension of Information Set Monte Carlo Tree Search (IS-MCTS) that prevents a leakage of hidden information into opponent models which can occur in IS-MCTS and is particularly severe in Hanabi.
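For context on the determinization step that IS-MCTS variants build on, the sketch below samples a concrete assignment of a player's hidden cards consistent with revealed hints and the remaining deck. The card encoding, constraints, and rejection-sampling loop are simplified placeholders; the paper's re-determinizing procedure is not reproduced here.

import random
from collections import Counter

# Standard Hanabi deck counts per rank: 1s x3, 2s-4s x2, 5s x1, in five colors.
COLORS = "RGBYW"
DECK = Counter({(c, r): n for c in COLORS for r, n in zip(range(1, 6), (3, 2, 2, 2, 1))})

def sample_determinization(visible_cards, hidden_slots, constraints, rng=random):
    """visible_cards: cards the searcher can see (others' hands, discards, fireworks).
    hidden_slots: number of unknown cards in the searcher's own hand.
    constraints: per-slot predicates encoding the hints received so far."""
    remaining = DECK - Counter(visible_cards)
    pool = list(remaining.elements())
    for _ in range(10000):                        # simple rejection sampling
        rng.shuffle(pool)
        candidate = pool[:hidden_slots]
        if all(ok(card) for ok, card in zip(constraints, candidate)):
            return candidate
    raise RuntimeError("no consistent determinization found")

if __name__ == "__main__":
    visible = [("R", 1), ("R", 1), ("G", 3)]
    # Hints so far: slot 0 is known to be red, slot 1 is known to be a 5.
    constraints = [lambda c: c[0] == "R", lambda c: c[1] == 5, lambda c: True]
    print(sample_determinization(visible, hidden_slots=3, constraints=constraints))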
Towards Game-based Metrics for Computational Co-Creativity
TLDR: A mapping from modern electronic and tabletop games to open research problems in mixed-initiative systems and computational co-creativity is proposed, along with a number of metrics under which the performance of cooperative agents in these environments can be evaluated.