POMDP-lite for robust robot planning under uncertainty

@article{Chen2016POMDPliteFR,
  title={POMDP-lite for robust robot planning under uncertainty},
  author={Min Chen and Emilio Frazzoli and David Hsu and Wee Sun Lee},
  journal={2016 IEEE International Conference on Robotics and Automation (ICRA)},
  year={2016},
  pages={5427-5433}
}
The partially observable Markov decision process (POMDP) provides a principled general model for planning under uncertainty. However, solving a general POMDP is computationally intractable in the worst case. This paper introduces POMDP-lite, a subclass of POMDPs in which the hidden state variables are constant or only change deterministically. We show that a POMDP-lite is equivalent to a set of fully observable Markov decision processes indexed by a hidden parameter and is useful for modeling a… 
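To make the stated equivalence concrete, here is a minimal Python sketch (all array names and dimensions are hypothetical, not from the paper) of a POMDP-lite viewed as a finite set of fully observable MDPs indexed by a hidden parameter theta: the observable state is known, a Bayesian belief over theta is maintained from observations, and actions are scored as belief-weighted mixtures of the underlying MDPs.

```python
import numpy as np

# Hypothetical POMDP-lite sketch: the hidden variable theta is constant,
# so the model decomposes into a finite set of fully observable MDPs
# (one per theta) plus a belief b(theta) updated by Bayes' rule.

n_theta, n_s, n_a, n_o = 3, 4, 2, 2
rng = np.random.default_rng(0)

# T[k][s, a, s'] : transition model of the k-th MDP (observable state only)
T = [rng.dirichlet(np.ones(n_s), size=(n_s, n_a)) for _ in range(n_theta)]
# Z[k][s', a, o] : observation model conditioned on theta = k
Z = [rng.dirichlet(np.ones(n_o), size=(n_s, n_a)) for _ in range(n_theta)]
# R[k][s, a]     : reward of the k-th MDP
R = [rng.normal(size=(n_s, n_a)) for _ in range(n_theta)]

def belief_update(b, a, o, s_next):
    """Bayes update of the belief over theta; the state itself is observed."""
    b_new = np.array([b[k] * Z[k][s_next, a, o] for k in range(n_theta)])
    return b_new / b_new.sum()

def q_value(b, s, a, V, gamma=0.95):
    """One-step lookahead under the current belief over theta: a
    belief-weighted mixture of the underlying MDPs (V[k] is a value
    vector over states for the k-th MDP)."""
    return sum(b[k] * (R[k][s, a] + gamma * T[k][s, a] @ V[k])
               for k in range(n_theta))
```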

Citations

PODDP: Partially Observable Differential Dynamic Programming for Latent Belief Space Planning
TLDR
An efficient differential dynamic programming (DDP) algorithm is presented for belief-space planning in POMDPs with uncertainty over a discrete latent state and with continuous states, actions, observations, and nonlinear dynamics.
Efficient Decision-Theoretic Target Localization
TLDR
This work considers target localization, an information-gathering task where an agent takes actions leading to informative observations and a concentrated belief over possible target locations, and extends SARSOP, a state-of-the-art offline solver, to handle belief-dependent rewards, exploring different reward strategies and showing how they can be compactly represented.
Online Risk-Bounded Motion Planning for Autonomous Vehicles in Dynamic Environments
TLDR
This work models the motion planning problem as a partially observable Markov decision process (POMDP) and proposes an online system that combines an intent recognition algorithm and a POMDP solver to generate risk-bounded plans for the ego vehicle as it navigates among a number of dynamic agent vehicles.
Bayesian Policy Optimization for Model Uncertainty
TLDR
This work formulates the problem of model uncertainty as a continuous Bayes-Adaptive Markov Decision Process (BAMDP), in which an agent maintains a posterior distribution over latent model parameters given a history of observations and maximizes its expected long-term reward with respect to this belief distribution.
Data-driven planning via imitation learning
TLDR
A novel data-driven imitation learning framework efficiently trains planning policies by imitating a clairvoyant oracle: an oracle that, at training time, has full knowledge of the world map and can compute optimal decisions.
Active sensing for motion planning in uncertain environments via mutual information policies
TLDR
The problem is shown to be NP-hard, and a suboptimal but computationally efficient solution based on mutual information is proposed that returns a complete policy and a bound on the gap between the policy's expected cost and the optimal cost.
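A greedy mutual-information sensing rule of the kind this line of work builds on fits in a few lines; the sketch below is an illustration under assumed names (a belief over target locations, and a per-action sensor model p_obs[a][x, o] = P(o | target at x, action a)), not the paper's actual algorithm.

```python
import numpy as np

def entropy(p):
    p = p[p > 0]
    return -np.sum(p * np.log(p))

def mutual_information(belief, p_obs_a):
    """I(X; O | a) = H(O) - H(O | X) for one sensing action a."""
    p_o = belief @ p_obs_a                     # marginal over observations
    h_o_given_x = np.sum(belief * np.array([entropy(row) for row in p_obs_a]))
    return entropy(p_o) - h_o_given_x

def greedy_sensing_action(belief, p_obs):
    """Pick the action whose observation is most informative about X."""
    return max(range(len(p_obs)),
               key=lambda a: mutual_information(belief, p_obs[a]))
```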
Hindsight is Only 50/50: Unsuitability of MDP based Approximate POMDP Solvers for Multi-resolution Information Gathering
TLDR
A set of conditions under which MDP-based POMDP solvers are provably suboptimal is derived, the well-known tiger problem is used to demonstrate such suboptimality, and it is shown that multi-resolution, budgeted information gathering cannot be addressed using MDP-based POMDP solvers.
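The tiger-problem pathology is easy to reproduce. The sketch below (hypothetical code using the standard tiger rewards) computes values under QMDP, a common MDP-based approximation, and shows that the value it assigns to the purely information-gathering "listen" action is independent of the belief and of the sensor's accuracy, which is exactly why such solvers cannot trade off budgeted, multi-resolution sensing.

```python
# Tiger problem: tiger behind left/right door; listen costs -1, opening the
# correct door gives +10, the wrong door -100, then the problem resets.
gamma = 0.95
V_star = 10.0 / (1.0 - gamma)   # underlying-MDP value: state observable,
                                # so always open the correct door
Q_star = {"listen":    -1.0  + gamma * V_star,
          "open-good": 10.0  + gamma * V_star,
          "open-bad": -100.0 + gamma * V_star}

def qmdp(b_left, action):
    """QMDP value Q(b, a) = sum_s b(s) Q*(s, a); 'open-right' is the good
    outcome in the tiger-left state and the bad one in tiger-right."""
    if action == "listen":
        return Q_star["listen"]          # independent of the belief!
    return b_left * Q_star["open-good"] + (1 - b_left) * Q_star["open-bad"]

for b in (0.5, 0.7, 0.9, 0.99):
    print(f"b(left)={b:.2f}  listen={qmdp(b, 'listen'):7.2f}  "
          f"open-right={qmdp(b, 'open-right'):7.2f}")

# QMDP's value for 'listen' is the same whether the microphone is 85%
# accurate or pure noise: the approximation assumes the state is fully
# observed after one step, so it never credits an action for the belief
# sharpening it produces.
```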
ROS-POMDP – A Platform for Robotics Planning using PLPs and RDDL in ROS
TLDR
ROS-POMDP builds on the more realistic POMDP model, with stochastic actions and sensing, and seeks to make it very easy for roboticists to replace hand-written scripts and controllers with principled POMDP-based controllers.
Robust and Adaptive Planning under Model Uncertainty
TLDR
The Robust Adaptive Monte Carlo Planning (RAMCP) algorithm is proposed, which allows computation of risk-sensitive Bayes-adaptive policies that optimally trade off exploration, exploitation, and robustness.
Simultaneous active parameter estimation and control using sampling-based Bayesian reinforcement learning
TLDR
This work frames the problem as a Bayes-adaptive Markov decision process and solves it online using Monte Carlo tree search and an extended Kalman filter to handle Gaussian process noise and parameter uncertainty in a continuous space.
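The standard construction behind this kind of approach, augmenting the state with the unknown parameters so that a single filter tracks both, can be sketched as follows; function names and the Jacobian interface are assumptions for illustration, not the paper's API.

```python
import numpy as np

def ekf_step(mu, P, u, z, f, h, F_jac, H_jac, Q, R):
    """One EKF predict/update on an augmented state [x; theta].
    f, h are the dynamics and measurement functions; F_jac, H_jac their
    Jacobians. Since theta is constant, the parameter block of f is the
    identity, and the covariance over theta shrinks only when measurements
    are informative about it."""
    mu_pred = f(mu, u)                       # predict
    F = F_jac(mu, u)
    P_pred = F @ P @ F.T + Q
    H = H_jac(mu_pred)                       # update
    S = H @ P_pred @ H.T + R
    K = P_pred @ H.T @ np.linalg.inv(S)
    mu_new = mu_pred + K @ (z - h(mu_pred))
    P_new = (np.eye(len(mu)) - K @ H) @ P_pred
    return mu_new, P_new
```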
...

References

Showing 1-10 of 30 references
Planning under Uncertainty for Robotic Tasks with Mixed Observability
TLDR
A factored model is used to represent separately the fully and partially observable components of a robot's state and to derive a compact lower-dimensional representation of its belief space, which can be combined with any point-based algorithm to compute approximate POMDP solutions.
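The factorization is simple to state in code: with the observable component x known exactly, the belief over the full state (x, y) collapses to a point x plus a low-dimensional distribution over the hidden component y alone. The sketch below assumes model arrays T_y[y, a, x, y'] and Z_y[y', a, x', o]; the layout is an illustration, not the paper's representation.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class MixedBelief:
    x: int               # fully observable component, known exactly
    b_y: np.ndarray      # belief over the partially observable component

def update(mb, a, o, x_next, T_y, Z_y):
    """Bayes update over y only, conditioned on the observed x and x'."""
    pred = mb.b_y @ T_y[:, a, mb.x, :]       # predict y'
    post = pred * Z_y[:, a, x_next, o]       # weight by the observation
    return MixedBelief(x_next, post / post.sum())
```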
An online and approximate solver for POMDPs with continuous action space
TLDR
General Pattern Search in Adaptive Belief Tree (GPS-ABT) is an approximate and online POMDP solver for problems with continuous action spaces; results on a box-pushing task and an extended Tag benchmark problem are promising.
Planning how to learn
TLDR
This work presents a simple algorithm for offline POMDP planning in the continuous state space that incorporates learning objectives in the computed plan, which then enables the robot to learn nearly optimally online and reach the goal.
Online Planning Algorithms for POMDPs
TLDR
The objectives here are to survey the various existing online POMDP methods, analyze their properties, and discuss their advantages and disadvantages, and to thoroughly evaluate these online approaches in different environments under various metrics.
SARSOP: Efficient Point-Based POMDP Planning by Approximating Optimally Reachable Belief Spaces
TLDR
This work has developed a new point-based POMDP algorithm that exploits the notion of optimally reachable belief spaces to improve computational efficiency; it substantially outperformed one of the fastest existing point-based algorithms.
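For context, the core operation that point-based solvers such as SARSOP repeat is the alpha-vector backup at sampled belief points; a minimal sketch is below, with assumed array layouts T[s, a, s'], Z[s', a, o], R[s, a]. SARSOP's actual contribution, biasing sampling toward the optimally reachable belief space, is not captured here.

```python
import numpy as np

def point_based_backup(b, alphas, T, Z, R, gamma=0.95):
    """One Bellman backup at belief b over a set of alpha-vectors.
    Returns the single new alpha-vector that is maximal at b."""
    n_s, n_a, _ = T.shape
    n_o = Z.shape[2]
    best_val, best_alpha = -np.inf, None
    for a in range(n_a):
        alpha_a = R[:, a].astype(float)
        for o in range(n_o):
            # g_i[s] = sum_{s'} T[s,a,s'] Z[s',a,o] alpha_i[s']
            gs = [T[:, a, :] @ (Z[:, a, o] * alpha) for alpha in alphas]
            alpha_a = alpha_a + gamma * max(gs, key=lambda g: b @ g)
        if b @ alpha_a > best_val:
            best_val, best_alpha = b @ alpha_a, alpha_a
    return best_alpha
```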
Monte-Carlo Planning in Large POMDPs
TLDR
POMCP is the first general-purpose planner to achieve high performance in such large and unfactored POMDPs as 10 x 10 battleship and partially observable PacMan, with approximately 10^18 and 10^56 states respectively.
Intention-Aware Motion Planning
TLDR
Experiments show that the proposed method, which constructs a practical model by assuming a finite set of unknown intentions, outperforms common alternatives because of its ability to recognize intentions and use the information effectively for decision making.
DESPOT: Online POMDP Planning with Regularization
TLDR
This paper presents an online POMDP algorithm that alleviates the computational difficulties of POMDP planning by focusing the search on a set of randomly sampled scenarios; it gives an output-sensitive performance bound for all policies derived from a DESPOT and shows that R-DESPOT works well if a small optimal policy exists.
Intention-aware online POMDP planning for autonomous driving in a crowd
TLDR
This paper presents an intention-aware online planning approach for autonomous driving amid many pedestrians that uses the partially observable Markov decision process (POMDP) for systematic, robust decision making under uncertainty.
...