The Dynamics of Reinforcement Learning in Cooperative Multiagent Systems
This work distinguishes reinforcement learners that are unaware of (or ignore) the presence of other agents from those that explicitly attempt to learn the value of joint actions and the strategies of their counterparts, and proposes alternative optimistic exploration strategies that increase the likelihood of convergence to an optimal equilibrium.
Decision-Theoretic Planning: Structural Assumptions and Computational Leverage
This paper presents an overview and synthesis of MDP-related methods, showing how they provide a unifying framework for modeling many classes of planning problems studied in AI, and describes structural properties of MDPs that, when exhibited by particular classes of problems, can be exploited in the construction of optimal or approximately optimal policies or plans.
CP-nets: A Tool for Representing and Reasoning with Conditional Ceteris Paribus Preference Statements
- Craig Boutilier, R. Brafman, C. Domshlak, H. Hoos, D. Poole
- Economics, Computer Science · J. Artif. Intell. Res.
- 30 June 2011
This paper proposes a qualitative graphical representation of preferences that reflects conditional dependence and independence of preference statements under a ceteris paribus (all else being equal) interpretation, and provides a formal semantics for this model.
Context-Specific Independence in Bayesian Networks
This paper proposes a formal notion of context-specific independence (CSI), based on regularities in the conditional probability tables (CPTs) at a node, and proposes a technique, analogous to (and based on) d-separation, for determining when such independence holds in a given network.
Planning, Learning and Coordination in Multiagent Decision Processes
- Craig Boutilier
- Computer Science · TARK
- 17 March 1996
This work investigates the extent to which methods from single-agent planning and learning can be applied in multiagent settings, and studies the decomposition of sequential decision processes so that coordination can be learned locally, at the level of individual states.
SPUDD: Stochastic Planning using Decision Diagrams
This work proposes and examines a new value iteration algorithm for MDPs that uses algebraic decision diagrams (ADDs) to represent value functions and policies, assuming an ADD input representation of the MDP.
Stochastic dynamic programming with factored representations
Sequential Optimality and Coordination in Multiagent Systems
- Craig Boutilier
- Computer Science · IJCAI
- 31 July 1999
This work proposes an extension of value iteration in which the system's state space is augmented with the state of the coordination mechanism adopted, allowing agents to reason about the short- and long-term prospects for coordination and the long-term consequences of (mis)coordination, and to decide, based on expected value, whether to engage or avoid coordination problems.
Budgeted Social Choice: From Consensus to Personalized Decision Making
This work develops a general framework for social choice problems in which a limited number of alternatives can be recommended to an agent population, and generalizes certain multiwinner election schemes.