## 51 Citations

### The Minds of Many: Opponent Modeling in a Stochastic Game

- Computer Science
- IJCAI
- 2017

This paper introduces a stereotyping mechanism that segments the agent population into sub-groups of agents with similar behaviour, allowing larger groups of agents to be modelled robustly, and shows that Theory of Mind modelling is useful in many artificial intelligence applications.

### Learning against sequential opponents in repeated stochastic games

- Computer Science
- 2017

This article presents a formal model of sequential interactions, in which subsets from the player population are drawn sequentially to play a repeated stochastic game with an unknown (small) number of repetitions, and proposes a learning algorithm to act in these sequential interactions.

### Robust Stochastic Bayesian Games for Behavior Space Coverage

- Computer Science
- ArXiv
- 2020

This work combines the optimality criteria of the Robust Markov Decision Process (RMDP) and the Stochastic Bayesian Game (SBG) to exponentially reduce the sample complexity for planning with hypothesis sets defined over continuous behavior spaces.

### Towards a Fast Detection of Opponents in Repeated Stochastic Games

- Computer Science
- AAMAS Workshops
- 2017

This work presents a formal model of sequential interactions, in which multiagent algorithms interact repeatedly with a fixed set of opponents, together with a corresponding algorithm that combines two established frameworks, Pepper and Bayesian policy reuse.

### Interactive POMDPs with finite-state models of other agents

- Computer Science, Economics
- Autonomous Agents and Multi-Agent Systems
- 2016

This work proposes a special case of interactive partially observable Markov decision process, in which the agent does not explicitly model the other agents’ beliefs and preferences, and instead represents them as stochastic processes implemented by probabilistic deterministic finite state controllers (PDFCs).

### Reasoning about Hypothetical Agent Behaviours and their Parameters

- Computer Science
- AAMAS
- 2017

This work presents a general method that allows an agent to reason about both the relative likelihood of types and the values of any bounded continuous parameters within types. The method maintains individual parameter estimates for each type and selectively updates the estimates for some types after each observation.

### A Survey of Learning in Multiagent Environments: Dealing with Non-Stationarity

- Computer Science
- ArXiv
- 2017

This survey presents a coherent overview of work that addresses opponent-induced non-stationarity with tools from game theory, reinforcement learning and multi-armed bandits, arriving at a new framework and five categories (in increasing order of sophistication): ignore, forget, respond to target models, learn models, and theory of mind.

### Learning Best Response Strategies for Agents in Ad Exchanges

- Computer Science
- EUMAS
- 2018

This work addresses the Harsanyi-Bellman Ad Hoc Coordination (HBA) algorithm, which conceptualises this interaction in terms of a Stochastic Bayesian Game and arrives at optimal actions by best responding with respect to probabilistic beliefs maintained over a candidate set of opponent behaviour profiles.

### Can bounded and self-interested agents be teammates? Application to planning in ad hoc teams

- Computer Science
- Autonomous Agents and Multi-Agent Systems
- 2016

This work demonstrates that bounded, finitely-nested reasoning by a self-interested agent implies that the agent may fail to obtain optimal team solutions in cooperative settings when it is part of a team. The limitation is addressed by including models at level 0 whose solutions involve reinforcement learning.

### Deep Interactive Bayesian Reinforcement Learning via Meta-Learning

- Computer Science
- AAMAS
- 2021

This work proposes to meta-learn approximate belief inference and Bayes-optimal behaviour for a given prior, and shows empirically that this approach outperforms existing methods that use a model-free approach, sample from the approximate posterior, maintain memory-free models of others, or do not fully utilise the known structure of the environment.

## References

Showing 1-10 of 108 references

### A Framework for Sequential Planning in Multi-Agent Settings

- Economics
- AI&M
- 2004

This paper extends the framework of partially observable Markov decision processes (POMDPs) to multi-agent settings by incorporating the notion of agent models into the state space and expresses the agents' autonomy by postulating that their models are not directly manipulable or observable by other agents.

### On the impossibility of predicting the behavior of rational agents

- Economics
- Proceedings of the National Academy of Sciences of the United States of America
- 2001

It is concluded that there are strategic situations in which it is impossible in principle for perfectly rational agents to learn to predict the future behavior of other perfectly rational agents based solely on their observed actions.

### An Empirical Study on the Practical Impact of Prior Beliefs over Policy Types

- Economics
- AAAI
- 2015

It is shown that prior beliefs can have a significant impact on the long-term performance of such methods, and that the magnitude of the impact depends on the depth of the planning horizon.

### Are You Doing What I Think You Are Doing? Criticising Uncertain Agent Models

- Computer Science
- UAI
- 2015

A novel algorithm is presented which decides this question in the form of a frequentist hypothesis test, which allows for multiple metrics in the construction of the test statistic and learns its distribution during the interaction process, with asymptotic correctness guarantees.

### Bayesian learning and convergence to Nash equilibria without common priors

- Economics
- 1998

Summary. Consider an infinitely repeated game where each player is characterized by a “type” which may be unknown to the other players in the game. Suppose further that each player's belief about…

### On Convergence and Optimality of Best-Response Learning with Policy Types in Multiagent Systems

- Computer Science
- UAI
- 2014

A novel characterisation of optimality is provided which allows experts to use efficient model checking algorithms to verify optimality of types, and a new posterior which can learn correlated distributions is proposed.

### Games with Incomplete Information Played by "Bayesian" Players, I-III: Part I. The Basic Model

- Economics
- Manag. Sci.
- 2004

The paper develops a new theory for the analysis of games with incomplete information where the players are uncertain about some important parameters of the game situation, such as the payoff functions, the strategies available to various players, the information other players have about the game, etc.

### Bayes' Bluff: Opponent Modelling in Poker

- Computer Science
- UAI
- 2005

This paper presents a Bayesian probabilistic model for a broad class of poker games, separating the uncertainty in the game dynamics from the uncertainty of the opponent's strategy.

### A game-theoretic model and best-response learning method for ad hoc coordination in multiagent systems

- Computer Science
- AAMAS
- 2013

This work conceptualises the ad hoc coordination problem formally as a stochastic Bayesian game in which the behaviour of a player is determined by its type, and derives a solution, called Harsanyi-Bellman Ad Hoc Coordination (HBA), which utilises a set of user-defined types to characterise players based on their observed behaviours.