Incremental plan aggregation for generating policies in MDPs

@inproceedings{TeichteilKnigsbuch2010IncrementalPA,
  title={Incremental plan aggregation for generating policies in MDPs},
  author={F. Teichteil-K{\"o}nigsbuch and U. Kuter and G. Infantes},
  booktitle={AAMAS},
  year={2010}
}
Despite the recent advances in planning with MDPs, the problem of generating good policies is still hard. This paper describes a way to generate policies in MDPs by (1) determinizing the given MDP model into a classical planning problem; (2) building partial policies off-line by producing solution plans to the classical planning problem and incrementally aggregating them into a policy; and (3) using sequential Monte-Carlo (MC) simulations of the partial policies before execution, in order to…
Anytime optimal MDP planning with trial-based heuristic tree search
TLDR
The main contribution of this thesis, the Trial-based Heuristic Tree Search (THTS) framework, allows the description of algorithms in terms of only six ingredients that can be mixed and matched at will, and shows theoretically and empirically that all proposed algorithms improve significantly over the application of the state-of-the-art approach.
Probabilistic Planning with Reduced Models
  • L. Pineda
  • Computer Science, Mathematics
  • AAAI
  • 2014
TLDR
This framework places recent work on MDP determinization in a broader context and lays the foundation for efficient and systematic exploration of the space of MDP model reductions.
A Robotic Execution Framework for Online Probabilistic (Re)Planning
TLDR
This work proposes an intermediate approach, which generates incomplete partial policies taking into account mid-term probabilistic uncertainties, continually improving them on a gliding horizon or regenerating them when they fail.
Extending Classical Planning Heuristics to Probabilistic Planning with Dead-Ends
TLDR
This work applies the theoretical analysis to the well-known classical planning heuristics hmax and hadd, and proves that the extended hmax is admissible, and extends the hadd to solve more problems than non-discounted ones, with better criteria values.
Plan aggregation for strong cyclic planning in nondeterministic domains
We describe a planning algorithm, NDP2, that finds strong-cyclic solutions to nondeterministic planning problems by using a classical planner to solve a sequence of classical planning problems. NDP2…
Generalizing the Role of Determinization in Probabilistic Planning
TLDR
This work shows that learning a good determinization for a planning domain can be done efficiently and can improve performance, and introduces a planner, FF-LAO*, that outperforms state-of-the-art probabilistic planners on several well-known competition benchmarks.
Incremental contingency planning for recovering from critical outcomes in high-probability seed plans
TLDR
This work describes approaches to generating high-probability seed plans and to incremental contingency planning on the critical outcomes, which focuses on repairing critical outcomes that cannot be fixed by runtime replanning.
Revisiting Goal Probability Analysis in Probabilistic Planning
TLDR
This work designs suitable termination criteria, search algorithm variants, dead-end pruning methods using classical planning heuristics, and node selection strategies, and explores the general case where there may be 0-reward cycles, and the practically relevant special case of acyclic planning.
Revisiting Goal Probability Analysis in Probabilistic Planning Technical Report
Maximizing goal probability is an important objective in probabilistic planning, yet algorithms for its optimal solution are severely underexplored. There is scant evidence of what the empirical…
Contingent Planning Using Counter-Examples from a Conformant Planner
Decision-making for autonomous robots in real world applications has to manage uncertainties in order to efficiently accomplish a mission. Some planning methods deal with uncertainty by improving the…

References

SHOWING 1-10 OF 21 REFERENCES
FF + FPG: Guiding a Policy-Gradient Planner
TLDR
This paper shows how to use an external teacher to guide FPG's exploration using the actions suggested by FF's heuristic (Hoffmann 2001), as FF-Replan has proved efficient for probabilistic re-planning.
Abstraction and Approximate Decision-Theoretic Planning
TLDR
An abstraction technique for MDPs that allows approximately optimal solutions to be computed quickly, together with methods by which the abstract solution can be viewed as a set of default reactions that can be improved incrementally and used as a heuristic for search-based planning or other MDP methods.
FF-Replan: A Baseline for Probabilistic Planning
TLDR
This paper gives the first technical description of FF-Replan and provides an analysis of its results on all of the recent IPPC-04 and IPPC-06 domains, in the hope that this will inspire extensions and insight into the approach and planning domains themselves that will soon lead to the dethroning of FF-Replan.
Probabilistic Planning via Determinization in Hindsight
TLDR
An implementation of hindsight optimization for probabilistic planning based on deterministic forward heuristic search is described and its performance on planning-competition benchmarks and other probabilistically interesting problems is evaluated.
Concurrent Probabilistic Planning in the Graphplan Framework
TLDR
This work presents an extension of the full Graphplan framework to probabilistic domains that demonstrates a method of efficiently finding optimal contingency plans using a goal regression search, and produces a planner that is competitive with the state of the art.
The Metric-FF Planning System: Translating "Ignoring Delete Lists" to Numeric State Variables
Planning with numeric state variables has been a challenge for many years, and was a part of the 3rd International Planning Competition (IPC-3). Currently one of the most popular and successful algorithmic…
Equivalence notions and model minimization in Markov decision processes
TLDR
The generalization of bisimulation to stochastic processes yields a non-trivial notion of state equivalence that guarantees the optimal policy for the reduced model immediately induces a corresponding optimal policy for the original model.
Labeled RTDP: Improving the Convergence of Real-Time Dynamic Programming
TLDR
This paper introduces a labeling scheme into RTDP that speeds up its convergence while retaining its good anytime behavior, and shows that Labeled RTDP (LRTDP) converges orders of magnitude faster than RTDP, and faster also than another recent heuristic-search DP algorithm, LAO*.
Stochastic dynamic programming with factored representations
TLDR
This work uses dynamic Bayesian networks (with decision trees representing the local families of conditional probability distributions) to represent stochastic actions in an MDP, together with a decision-tree representation of rewards, and develops versions of standard dynamic programming algorithms that directly manipulate decision-tree representations of policies and value functions.
The FF Planning System: Fast Plan Generation Through Heuristic Search
TLDR
A novel search strategy is introduced that combines hill-climbing with systematic search, and it is shown how other powerful heuristic information can be extracted and used to prune the search space.