Scaling Up Optimal Heuristic Search in Dec-POMDPs via Incremental Expansion

@inproceedings{Spaan2011ScalingUO,
  title={Scaling Up Optimal Heuristic Search in Dec-POMDPs via Incremental Expansion},
  author={Matthijs T. J. Spaan and Frans A. Oliehoek and Chris Amato},
  booktitle={IJCAI},
  year={2011}
}
Planning under uncertainty for multiagent systems can be formalized as a decentralized partially observable Markov decision process (Dec-POMDP). We advance the state of the art for optimally solving this model, building on the Multiagent A* (MAA*) heuristic search method. A key insight is that we can avoid fully expanding a search node, which would generate a number of children that is doubly exponential in the node's depth. Instead, we expand the children incrementally, generating a next child only when it might have the highest heuristic value.
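
The following is a minimal sketch of that idea, not the authors' actual implementation: it assumes a hypothetical expand_next_child(node) helper that yields a node's children one at a time in decreasing order of heuristic upper bound, so a partially expanded parent simply re-enters the open list keyed by the bound of its next unexpanded child.

```python
import heapq
import itertools

def incremental_a_star(root, expand_next_child, is_full_policy):
    """A* with incremental node expansion (illustrative sketch only).

    expand_next_child(node) -> (child, bound_of_following_child) is a
    hypothetical helper returning the node's next-best child and an
    upper bound on the child after that (None when exhausted).
    """
    tie = itertools.count()                  # tie-breaker for equal keys
    # heapq is a min-heap, so store negated upper bounds.
    frontier = [(-root.upper_bound, next(tie), root)]
    best_value, best_node = float("-inf"), None

    while frontier:
        neg_bound, _, node = heapq.heappop(frontier)
        if -neg_bound <= best_value:         # nothing left can improve
            break
        if is_full_policy(node):             # a complete joint policy
            if node.value > best_value:
                best_value, best_node = node.value, node
            continue
        child, next_bound = expand_next_child(node)
        if child is not None:
            heapq.heappush(frontier, (-child.upper_bound, next(tie), child))
        if next_bound is not None:           # parent not exhausted: requeue
            heapq.heappush(frontier, (-next_bound, next(tie), node))
    return best_value, best_node
```

A parent is thus only revisited when its next child could still beat every node currently open, so the lower-ranked children of a node are often never generated at all.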

Citations

Incremental Clustering and Expansion for Faster Optimal Planning in Dec-POMDPs
This article presents the state of the art in optimal solution methods for decentralized partially observable Markov decision processes (Dec-POMDPs), which are general models for collaborative multiagent planning under uncertainty, and reports extensive empirical results demonstrating that GMAA*-ICE, an algorithm synthesizing these advances, can optimally solve Dec-POMDPs of unprecedented size.
Scaling Up Decentralized MDPs Through Heuristic Search
Provides an updated proof that an optimal policy depends not on the agents' full histories but only on their local observations, and presents a new algorithm based on heuristic search that expands search nodes using constraint optimization.
Heuristic search of multiagent influence space
Takes the logical albeit nontrivial next step of combining multiagent A* search and influence-based abstraction into a single algorithm; empirical results indicate that A* can provide significant computational savings on top of those already afforded by influence-space search.
Solving Multi-agent MDPs Optimally with Conditional Return Graphs
Proposes CoRe, a novel branch-and-bound policy search algorithm building on conditional return graphs (CRGs), which typically requires less runtime than the available alternatives and is able to find solutions to problems previously considered unsolvable.
Multi-Agent Planning under Uncertainty with Monte Carlo Q-Value Function
Decentralized partially observable Markov decision processes (Dec-POMDPs) are general multi-agent models for planning under uncertainty, but are intractable to solve. Doubly exponential growth of the …
Optimally Solving Dec-POMDPs as Continuous-State MDPs
Introduces the idea of transforming a Dec-POMDP into a continuous-state deterministic MDP with a piecewise-linear and convex value function, along with a heuristic search that relies on feature-based compact representations, point-based updates, and efficient action selection.
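
In rough notation (assumed here for illustration; the paper's definitions are more detailed), the transformation replaces joint histories with an occupancy state, a distribution over hidden states and joint observation histories, over which the optimal value function is piecewise-linear and convex:

```latex
\[
\sigma_t(s,\vec{\theta})
  = \Pr\!\bigl(s_t = s,\ \vec{\theta}_t = \vec{\theta} \mid b_0, \pi\bigr),
\qquad
V_t^*(\sigma_t)
  = \max_{\alpha \in \Gamma_t} \sum_{s,\vec{\theta}}
      \sigma_t(s,\vec{\theta})\,\alpha(s,\vec{\theta}),
\]
```

where Gamma_t is a finite set of vectors, mirroring POMDP value iteration; the occupancy state evolves deterministically given the joint decision rule, which is what makes the lifted MDP deterministic.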
Accelerated Vector Pruning for Optimal POMDP Solvers
Shows how the LPs in POMDP pruning subroutines can be decomposed using a Benders decomposition; the resulting algorithm adds LP constraints incrementally and typically needs only a small fraction of them.
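
A minimal sketch of the general idea, lazy constraint generation for the standard vector-dominance LP, assuming scipy and a nonempty vector set; the paper's actual Benders-based scheme differs in its details:

```python
import numpy as np
from scipy.optimize import linprog

def wins_somewhere(w, U, tol=1e-9):
    """Test whether vector w beats every vector in U at some belief,
    adding dominance constraints lazily instead of all |U| at once.
    w: (n,) array; U: (m, n) array, m >= 1. Returns a witness belief
    or None. (Illustrative constraint generation, not the paper's
    exact method.)
    """
    n = w.shape[0]
    active = [0]                                  # start with one constraint
    while True:
        # Variables: belief b (n entries) and margin d; maximize d
        # subject to b.(u - w) + d <= 0 for each active u, b in simplex.
        A_ub = np.array([np.append(U[i] - w, 1.0) for i in active])
        res = linprog(np.append(np.zeros(n), -1.0),
                      A_ub=A_ub, b_ub=np.zeros(len(active)),
                      A_eq=np.append(np.ones(n), 0.0).reshape(1, -1),
                      b_eq=[1.0],
                      bounds=[(0, 1)] * n + [(None, None)])
        b, d = res.x[:n], res.x[n]
        worst = int(np.argmax(U @ b))             # most violated constraint
        if b @ (w - U[worst]) >= d - tol:         # none violated: done
            return b if d > tol else None
        active.append(worst)
```

Each iteration solves a small LP and adds at most one constraint, so dominance is often decided long before all |U| constraints are materialized.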
Producing efficient error-bounded solutions for transition independent decentralized MDPs
Presents the first approach for solving transition-independent decentralized Markov decision processes (Dec-MDPs) that inherits error bounds and fast convergence rates, and provides the foundation for the first algorithm for solving infinite-horizon transition-independent Dec-MDPs.

References

Showing 1-10 of 23 references
Memory-Bounded Dynamic Programming for DEC-POMDPs
Presents the first memory-bounded dynamic programming algorithm for finite-horizon decentralized POMDPs; it can handle horizons multiple orders of magnitude larger than previously possible while achieving the same or better solution quality.
MAA*: A Heuristic Search Algorithm for Solving Decentralized POMDPs
Presents multi-agent A* (MAA*), the first complete and optimal heuristic search algorithm for solving decentralized partially observable Markov decision problems (DEC-POMDPs) with finite horizon, and introduces an anytime variant of MAA*.
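
To see why full expansion in MAA* is so costly, which is what the incremental expansion above avoids: a node at depth t has one child per joint stage-t decision rule, and each agent's decision rule maps every length-t observation history to an action. A quick back-of-the-envelope count, with Dec-Tiger-like sizes assumed here purely for illustration:

```python
# Children per depth-t MAA* node: (|A| ** (|O| ** t)) ** n joint decision
# rules. Illustrative sizes: n=2 agents, |A|=3 actions, |O|=2 observations.
n, A, O = 2, 3, 2
for t in range(5):
    print(f"depth {t}: {(A ** (O ** t)) ** n} children")
# depth 0: 9, depth 1: 81, depth 2: 6561,
# depth 3: 43046721, depth 4: 1853020188851841
```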
Incremental Policy Generation for Finite-Horizon DEC-POMDPs
Proposes a new backup algorithm based on a reachability analysis of the state space; it can produce an optimal solution for any possible initial state, and further scalability can be achieved by exploiting a known start state.
Lossless clustering of histories in decentralized POMDPs
Proves that when two histories satisfy the proposed criterion they have the same optimal value and can thus be treated as one, and demonstrates empirically that this clustering can provide a speed-up of multiple orders of magnitude, allowing the optimal solution of significantly larger problems.
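
A bare-bones sketch of how such a test might look, assuming a hypothetical cond_dist(h) helper that returns the conditional distribution over hidden states and other agents' histories induced by local history h; the paper's criterion and machinery are richer than this:

```python
import numpy as np

def can_merge(h1, h2, cond_dist, tol=1e-12):
    """Two local histories are clustered losslessly when they induce the
    same conditional distribution over states and other agents' histories.
    cond_dist is a hypothetical helper returning that distribution as a
    flat numpy array over a fixed enumeration."""
    return np.allclose(cond_dist(h1), cond_dist(h2), atol=tol)

def cluster(histories, cond_dist):
    """Greedily group histories into lossless equivalence classes."""
    clusters = []
    for h in histories:
        for c in clusters:
            if can_merge(h, c[0], cond_dist):
                c.append(h)
                break
        else:
            clusters.append([h])
    return clusters
```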
Approximate solutions for partially observable stochastic games with common payoffs
Proposes an algorithm that approximates POSGs as a series of smaller, related Bayesian games, using heuristics such as QMDP to estimate the future discounted value of actions; the resulting policies are locally optimal with respect to the selected heuristic.
Online planning for multi-agent systems with bounded communication
Presents an online algorithm for planning under uncertainty in multi-agent settings modeled as DEC-POMDPs; it can solve problems too large for the best existing offline planners, and it outperforms the best prior online method, producing much higher value with much less communication in most cases.
Heuristic search for identical payoff Bayesian games
Presents a branch-and-bound algorithm that optimally solves identical-payoff Bayesian games for coordinating teams of cooperative agents; it shows a marked improvement over previous methods, obtaining speedups of up to 3 orders of magnitude on synthetic random games and up to 10 orders of magnitude on games arising in a DEC-POMDP context.
Point-based backup for decentralized POMDPs: complexity and new algorithms
The optimal algorithm exploits recent advances in the weighted-CSP literature to overcome the complexity of the backup operation, while the polytime approximation scheme provides a constant-factor approximation guarantee based on the number of belief points.
Formal models and algorithms for decentralized decision making under uncertainty
Analyzes five formal frameworks, three optimal algorithms, and a series of approximation techniques, providing insights into the structure of decentralized problems, the expressiveness of the various models, and the relative advantages and limitations of the different solution techniques.
Learning Policies for Partially Observable Environments: Scaling Up
Discusses several simple solution methods and shows that all can find near-optimal policies for a selection of extremely small POMDPs taken from the learning literature, but that none can solve a slightly larger and noisier problem based on robot navigation.