Adaptive Sampling using POMDPs with Domain-Specific Considerations

Gautam Salhotra, Chris Denniston, David A. Caron, Gaurav S. Sukhatme
2021 IEEE International Conference on Robotics and Automation (ICRA)

We investigate improving Monte Carlo Tree Search based solvers for Partially Observable Markov Decision Processes (POMDPs), when applied to adaptive sampling problems. We propose improvements in rollout allocation, the action exploration algorithm, and plan commitment. The first allocates a different number of rollouts depending on how many actions the agent has taken in an episode. We find that rollouts are more valuable after some initial information is gained about the environment. Thus, a… 
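
The rollout-allocation idea in the abstract (spending fewer rollouts early in an episode and more once some information about the environment has been gathered) can be illustrated with a minimal sketch. The excerpt above does not give the paper's actual schedule, so the linear ramp, the function name `allocate_rollouts`, and the `ramp` parameter are all assumptions made for illustration only.

```python
def allocate_rollouts(total_budget: int, num_steps: int, ramp: float = 2.0) -> list[int]:
    """Split a fixed rollout budget across the planning steps of one episode.

    Hypothetical schedule (not taken from the paper): per-step weights grow
    linearly from 1.0 at the first step to `ramp` at the last step, so later
    steps receive more rollouts when ramp > 1.
    """
    if num_steps <= 0:
        raise ValueError("num_steps must be positive")
    if total_budget < 0:
        raise ValueError("total_budget must be non-negative")
    if num_steps == 1:
        return [total_budget]

    # Linear ramp of weights over the episode.
    weights = [1.0 + (ramp - 1.0) * t / (num_steps - 1) for t in range(num_steps)]
    total_weight = sum(weights)

    # Floor each share, then hand leftover rollouts to the latest steps.
    allocation = [int(total_budget * w / total_weight) for w in weights]
    remainder = total_budget - sum(allocation)
    for i in range(remainder):
        allocation[num_steps - 1 - (i % num_steps)] += 1
    return allocation


schedule = allocate_rollouts(total_budget=1000, num_steps=10, ramp=3.0)
```

Here `schedule` holds one rollout count per planning step and sums to the full budget. Setting `ramp=1.0` recovers a uniform allocation, which is a convenient baseline to compare against.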

Learned Parameter Selection for Robotic Information Gathering

This work shows how to automatically configure a planner for informative path planning by training a reinforcement learning agent to select planner parameters at each iteration of informative path planning.

Informative Path Planning to Estimate Quantiles for Environmental Analysis

Scientists interested in studying natural phenomena often take physical specimens from locations in the environment for later analysis. These analysis locations are typically specified by expert

Fast and Scalable Signal Inference for Active Robotic Source Seeking

This work proposes a global and local factor graph model for active source seeking that allows the model to scale to a large number of measurements and represent unknown obstacles in the environment and demonstrates that this approach outperforms baseline methods in both simulated and real robot experiments.

A Study on Multirobot Quantile Estimation in Natural Environments

This work studies, across several axes, the impact of using multiple robots to estimate quantiles of a distribution of interest using an informative path planning formulation, and finds that while using more robots generally results in lower estimation error, this benefit is achieved under certain conditions.

Adaptive Sampling to Estimate Quantiles for Guiding Physical Sampling

This work proposes to guide scientists' physical sampling by having a robot perform an adaptive sampling survey and suggest locations whose values correspond to pre-specified quantiles of interest.

Monte-Carlo Planning in Large POMDPs

POMCP is the first general-purpose planner to achieve high performance in such large and unfactored POMDPs as 10 x 10 battleship and partially observable PacMan, with approximately 10^18 and 10^56 states respectively.

Dec-MCTS: Decentralized planning for multi-robot active perception

This work proposes a decentralized variant of Monte Carlo tree search (MCTS) that is suitable for a variety of tasks in multi-robot active perception and extends the theoretical analysis of standard MCTS to provide guarantees for convergence rates to the optimal payoff sequence.

Sequential Bayesian Optimisation for Spatial-Temporal Monitoring

This work formulates Sequential Bayesian Optimisation (SBO) with side-state information within a Partially Observable Markov Decision Process (POMDP) framework that can accommodate discrete and continuous observation spaces, and shows that the SBO POMDP optimisation outperforms myopic and non-myopic alternatives.

PLGRIM: Hierarchical Value Learning for Large-scale Exploration in Unknown Environments

This work proposes a scalable value learning framework, PLGRIM (Probabilistic Local and Global Reasoning on Information roadMaps), that bridges the gap between local, risk-aware resiliency and global, reward-seeking mission objectives and addresses large-scale exploration problems while providing locally near-optimal coverage plans.

Sampling-based robotic information gathering algorithms

This work proposes three sampling-based motion planning algorithms for generating informative mobile robot trajectories, analyzes their asymptotic optimality, and presents several conservative pruning strategies for modular, submodular, and time-varying information objectives.

Pilot Surveys for Adaptive Informative Sampling

This paper addresses the case where initial hyperparameters need to be estimated, but no prior data is available, and evaluates four pilot surveys, which use a softmax function on the distance between waypoints and previously sampled data for waypoint selection.

Why Non-myopic Bayesian Optimization is Promising and How Far Should We Look-ahead? A Study via Rollout

This work proves the improving nature of rollout in tackling lookahead BO and provides a theoretical and practical guideline to decide on the rolling horizon stagewise and shows the advantageous properties of the method over several myopic and non-myopic BO algorithms.

Bandit Based Monte-Carlo Planning

This work introduces UCT, a new algorithm that applies bandit ideas to guide Monte-Carlo planning. The algorithm is shown to be consistent, and finite-sample bounds are derived on the estimation error due to sampling.

The Bayesian Search Game

  • Marc Toussaint
  • Computer Science
    Theory and Principled Methods for the Design of Metaheuristics
  • 2014
This chapter draws links between three topics: No Free Lunch theorems, which lay the foundation for designing search heuristics that exploit prior knowledge about the function; partially observable Markov decision processes (POMDPs) and their approach to sequentially and optimally choosing search points; and the use of Gaussian processes as a representation of belief.

Science of Autonomy: Time-Optimal Path Planning and Adaptive Sampling for Swarms of Ocean Vehicles

The science of autonomy is the systematic development of fundamental knowledge about autonomous decision making and task completing in the form of testable autonomous methods, models and systems. In