# Lossless clustering of histories in decentralized POMDPs

```bibtex
@inproceedings{Oliehoek2009LosslessCO,
  title     = {Lossless clustering of histories in decentralized POMDPs},
  author    = {F. Oliehoek and S. Whiteson and M. Spaan},
  booktitle = {AAMAS},
  year      = {2009}
}
```

Decentralized partially observable Markov decision processes (Dec-POMDPs) constitute a generic and expressive framework for multiagent planning under uncertainty. However, planning optimally is difficult because solutions map local observation histories to actions, and the number of such histories grows exponentially in the planning horizon. In this work, we identify a criterion that allows for lossless clustering of observation histories: i.e., we prove that when two histories satisfy the…
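The abstract is truncated before it states the exact criterion, so the following is only a rough, hypothetical illustration of the general idea of lossless clustering: merging an agent's observation histories whenever they induce identical beliefs, so the merged set of histories shrinks without changing what any policy can condition on. All names here (`cluster_histories`, `belief_of`) are illustrative, not from the paper.

```python
from collections import defaultdict

def cluster_histories(histories, belief_of):
    """Group local observation histories whose induced beliefs coincide.

    `belief_of(h)` is an assumed helper returning the distribution
    induced by history h as a hashable value (e.g. a tuple of probs).
    Histories mapping to the same belief are merged into one cluster.
    """
    clusters = defaultdict(list)
    for h in histories:
        clusters[belief_of(h)].append(h)
    return list(clusters.values())

# Toy usage: beliefs precomputed in a dict keyed by history.
beliefs = {
    ("o1",): (0.5, 0.5),
    ("o2",): (0.5, 0.5),  # identical belief -> merged with ("o1",)
    ("o3",): (0.9, 0.1),
}
clusters = cluster_histories(beliefs.keys(), beliefs.get)
# two clusters: ("o1",) and ("o2",) together, ("o3",) alone
```

Because merged histories are, by assumption, indistinguishable to the planner, restricting policies to one representative per cluster loses no value.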

#### 45 Citations

Optimally Solving Dec-POMDPs as Continuous-State MDPs

- Mathematics, Computer Science
- IJCAI
- 2013

The idea of transforming a Dec-POMDP into a continuous-state deterministic MDP with a piecewise-linear and convex value function is introduced, along with a heuristic search that relies on feature-based compact representations, point-based updates, and efficient action selection.

Exploiting separability in multiagent planning with continuous-state MDPs

- Computer Science, Mathematics
- AAMAS
- 2014

It is shown that, under certain separability conditions of the optimal value function, the scalability of this approach can increase considerably, and the novel continuous-state MDP algorithm retains optimality and convergence guarantees.

Incremental clustering and expansion for faster optimal planning in decentralized POMDPs

- Mathematics
- 2013

This article presents the state-of-the-art in optimal solution methods for decentralized partially observable Markov decision processes (Dec-POMDPs), which are general models for collaborative…

Decentralized POMDPs

- Computer Science
- Reinforcement Learning
- 2012

This chapter presents an overview of the decentralized POMDP (Dec-POMDP) framework, and covers the forward heuristic search approach to solving Dec-POMDPs, as well as the backward dynamic programming approach.

Informed Initial Policies for Learning in Finite Horizon Dec-POMDPs

- 2012

Decentralized partially observable Markov decision processes (Dec-POMDPs) offer a formal model for planning in cooperative multiagent systems where agents operate with noisy sensors and actuators,…

Sufficient Plan-Time Statistics for Decentralized POMDPs

- Computer Science
- IJCAI
- 2013

This paper makes a contribution to the theory of decentralized POMDPs by showing how this dependence on the 'past joint policy' can be replaced by a sufficient statistic, and the results are extended to the case of k-step delayed communication.

Incremental Clustering and Expansion for Faster Optimal Planning in Dec-POMDPs

- Mathematics, Computer Science
- J. Artif. Intell. Res.
- 2013

This article presents the state-of-the-art in optimal solution methods for decentralized partially observable Markov decision processes (Dec-POMDPs), which are general models for collaborative multiagent planning under uncertainty, and presents extensive empirical results demonstrating that GMAA*-ICE, an algorithm that synthesizes these advances, can optimally solve Dec-POMDPs of unprecedented size.

Scaling Up Optimal Heuristic Search in Dec-POMDPs via Incremental Expansion

- Mathematics, Computer Science
- IJCAI
- 2011

A key insight is that one can avoid the full expansion of a search node that generates a number of children that is doubly exponential in the node's depth, allowing for optimal solutions over longer horizons for many benchmark problems.
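The incremental-expansion insight above can be sketched in generic best-first-search terms: rather than materializing all children of a node at once, keep a lazy iterator per node and produce the next child only when the node is revisited. This is a hypothetical minimal sketch, not the paper's GMAA*-style algorithm; `children_iter`, `heuristic`, and the toy string domain below are all illustrative.

```python
import heapq
import itertools

counter = itertools.count()  # unique tie-breaker so heap never compares nodes

def incremental_best_first(root, children_iter, heuristic, is_goal):
    """Best-first search that expands at most one child per node visit.

    Each frontier entry carries a lazy child iterator; popping a node
    yields one child, then the parent is re-inserted so its remaining
    children are only generated on demand.
    """
    frontier = [(-heuristic(root), next(counter), root, children_iter(root))]
    while frontier:
        neg_h, _, node, kids = heapq.heappop(frontier)
        if is_goal(node):
            return node
        child = next(kids, None)
        if child is not None:
            # parent goes back with its partially consumed iterator
            heapq.heappush(frontier, (neg_h, next(counter), node, kids))
            heapq.heappush(frontier, (-heuristic(child), next(counter),
                                      child, children_iter(child)))
    return None

# Toy domain: build the string "ab" one character at a time.
def kids_of(node):
    return iter([node + "a", node + "b"]) if len(node) < 2 else iter(())

def match_score(node):
    return sum(1 for x, y in zip(node, "ab") if x == y)

result = incremental_best_first("", kids_of, match_score,
                                lambda n: n == "ab")
# result == "ab"; the sibling "b" subtree is never fully expanded
```

The payoff is exactly the one the snippet describes: nodes whose siblings already look promising never pay the cost of enumerating their doubly-exponential child set.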

Point-based policy generation for decentralized POMDPs

- Computer Science
- AAMAS
- 2010

A new algorithm, Point-Based Policy Generation, avoids searching the entire joint policy space altogether and ensures that the best joint policy for each reachable belief state can be constructed directly, instead of first producing a large set of candidates.

Modeling and Planning with Macro-Actions in Decentralized POMDPs

- Computer Science, Medicine
- J. Artif. Intell. Res.
- 2019

This work extends three leading Dec-POMDP algorithms for policy generation to the macro-action case, and can synthesize control policies that exploit opportunities for coordination while balancing uncertainty, sensor information, and information about other agents.

#### References

Showing 1–10 of 18 references

Memory-Bounded Dynamic Programming for DEC-POMDPs

- Mathematics, Computer Science
- IJCAI
- 2007

This work presents the first memory-bounded dynamic programming algorithm for finite-horizon decentralized POMDPs, which can handle horizons that are multiple orders of magnitude larger than what was previously possible, while achieving the same or better solution quality.

Optimal Fixed-Size Controllers for Decentralized POMDPs

- 2006

Solving decentralized partially observable Markov decision processes (DEC-POMDPs) is a difficult task. Exact solutions are intractable in all but the smallest problems and approximate solutions…

Value-based observation compression for DEC-POMDPs

- Computer Science
- AAMAS
- 2008

This paper introduces a value-based observation compression technique that prunes the least valuable observations while maintaining an error bound on the value lost as a result of pruning.

Optimal and Approximate Q-value Functions for Decentralized POMDPs

- Computer Science
- J. Artif. Intell. Res.
- 2008

This paper studies whether similar Q-value functions can be defined for decentralized POMDP models (Dec-POMDPs) and describes a family of algorithms for extracting policies from such Q-value functions.

Approximate solutions for partially observable stochastic games with common payoffs

- Computer Science
- Proceedings of the Third International Joint Conference on Autonomous Agents and Multiagent Systems, 2004. AAMAS 2004.
- 2004

This work proposes an algorithm that approximates POSGs as a series of smaller, related Bayesian games, using heuristics such as QMDP to provide the future discounted value of actions, and results in policies that are locally optimal with respect to the selected heuristic.
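The QMDP heuristic mentioned above is simple enough to state concretely: the value of an action under a belief is approximated by the belief-weighted Q-value of the underlying fully observable MDP, ignoring future partial observability. A minimal sketch, with hypothetical toy states and actions:

```python
def qmdp_value(belief, q_mdp, action):
    """Q_MDP(b, a) = sum_s b(s) * Q(s, a).

    `belief` maps state -> probability; `q_mdp` maps state -> action ->
    Q-value of the underlying MDP (assumed precomputed, e.g. by value
    iteration on the fully observable problem).
    """
    return sum(p * q_mdp[s][action] for s, p in belief.items())

# Toy numbers: two states, two actions.
belief = {"s0": 0.7, "s1": 0.3}
q = {"s0": {"go": 1.0, "stay": 0.0},
     "s1": {"go": 0.0, "stay": 2.0}}

best = max(["go", "stay"], key=lambda a: qmdp_value(belief, q, a))
# "go": 0.7*1.0 + 0.3*0.0 = 0.7; "stay": 0.7*0.0 + 0.3*2.0 = 0.6
```

Because QMDP assumes full observability from the next step onward, it is optimistic and never selects information-gathering actions, which is why it serves only as a heuristic for the Bayesian-game approximation rather than an exact solver.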

Exact Dynamic Programming for Decentralized POMDPs with Lossless Policy Compression

- Computer Science, Mathematics
- ICAPS
- 2008

A new method for solving DEC-POMDP problems is presented, based on compressing the policy belief space, which contains sequences of actions and observations that are linearly independent.

Optimizing Memory-Bounded Controllers for Decentralized POMDPs

- Mathematics, Computer Science
- UAI
- 2007

The experimental results show that nonlinear optimization can be used to provide high quality, concise solutions to decentralized decision problems under uncertainty.

Interaction-driven Markov games for decentralized multiagent planning under uncertainty

- Computer Science
- AAMAS
- 2008

A fast approximate solution method for planning in IDMGs is introduced, exploiting their particular structure, and its successful application on several large multiagent tasks is illustrated.

MAA*: A Heuristic Search Algorithm for Solving Decentralized POMDPs

- Computer Science
- UAI
- 2005

This work presents multi-agent A* (MAA*), the first complete and optimal heuristic search algorithm for solving decentralized partially observable Markov decision problems (Dec-POMDPs) with finite horizon, and introduces an anytime variant of MAA*.

Mixed Integer Linear Programming for Exact Finite-Horizon Planning in Decentralized POMDPs

- Computer Science, Mathematics
- ICAPS
- 2007

This paper represents each agent's policy in the sequence-form rather than the tree-form, thereby obtaining a very compact representation of the set of joint policies, and solves the problem of finding an n-agent joint policy for the optimal finite-horizon control of a decentralized POMDP.