Learning Q-network for Active Information Acquisition

@article{Jeong2019LearningQF,
  title={Learning Q-network for Active Information Acquisition},
  author={Heejin Jeong and Brent Schlotfeldt and Hamed Hassani and Manfred Morari and Daniel D. Lee and George J. Pappas},
  journal={2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
  year={2019},
  pages={6822--6827}
}
In this paper, we propose a novel Reinforcement Learning approach for solving the Active Information Acquisition problem, which requires an agent to choose a sequence of actions in order to acquire information about a process of interest using on-board sensors. The classic challenges in the information acquisition problem are the dependence of a planning algorithm on known models and the difficulty of computing information-theoretic cost functions over arbitrary distributions. In contrast, the… 

Learning to Track Dynamic Targets in Partially Known Environments

This work introduces Active Tracking Target Network (ATTN), a unified RL policy that is capable of solving the major sub-tasks of active target tracking (in-sight tracking, navigation, and exploration) and shows robust behavior when tracking agile and anomalous targets with a partially known target model.

Where to Look Next: Learning Viewpoint Recommendations for Informative Trajectory Planning

This work uses deep reinforcement learning to train an information-aware policy that guides a receding-horizon trajectory optimization planner, so that the resulting dynamically feasible and collision-free trajectories lead to observations that maximize the information gain and reduce the uncertainty about the environment.

Concurrent Learning Based Dual Control for Exploration and Exploitation in Autonomous Search

A concurrent learning framework for source search in an unknown environment using autonomous platforms equipped with onboard sensors that not only guarantees convergence, but produces better search performance and consumes much less computational time.

Active localization of multiple targets using noisy relative measurements

This work formulates this path planning problem as an unsupervised learning problem in which the measurements are aggregated using a Bayesian histogram filter and the robot learns to minimize the total uncertainty of each target in the shortest amount of time.

Active Localization of Multiple Targets from Noisy Relative Measurements

This paper presents a mobile robot tasked with localizing targets at unknown locations from relative measurements, moving so as to minimize the uncertainty in their locations as quickly as possible.

Scalable Reinforcement Learning Policies for Multi-Agent Control

A masking heuristic is developed that allows training on smaller problems with few pursuers-targets and execution on much larger problems, and the paper discusses how it enables a hedging behavior between pursuers that leads to a weak form of cooperation in spite of completely decentralized control execution.

Learning Continuous Control Policies for Information-Theoretic Active Perception

A mobile robot detecting landmarks within a limited sensing range is considered, and the problem of learning a control policy that maximizes the mutual information between the landmark states and the sensor observations is tackled.

Graph Neural Networks for Multi-Robot Active Information Acquisition

An Information-aware Graph Block Network (I-GBNet), an AIA adaptation of Graph Neural Networks that aggregates information over the graph representation and provides sequential decision-making in a distributed manner, is proposed.

Deep Reinforcement Learning for Active Target Tracking

This work introduces Active Tracking Target Network (ATTN), a unified deep RL policy that is capable of solving major sub-tasks of active target tracking – in-sight tracking, navigation, and exploration.

Concurrent Active Learning in Autonomous Airborne Source Search: Dual Control for Exploration and Exploitation

Compared with the information-theoretic approach, CL-DCEE not only guarantees convergence, but produces better search performance and consumes much less computational time.

References

Learning to gather information via imitation

This paper presents an efficient algorithm, EXPLORE, that trains a policy on the target distribution to imitate a clairvoyant oracle — an oracle that has full information about the world and computes non-myopic solutions to maximize information gathered.

Reinforcement learning in robotics: A survey

This article attempts to strengthen the links between the two research communities by surveying work in reinforcement learning for behavior generation in robots, highlighting both key challenges in robot reinforcement learning and notable successes.

Deep reinforcement learning for robotic manipulation with asynchronous off-policy updates

It is demonstrated that a recent deep reinforcement learning algorithm based on off-policy training of deep Q-functions can scale to complex 3D manipulation tasks and can learn deep neural network policies efficiently enough to train on real physical robots.

Reinforcement Learning: An Introduction

This book provides a clear and simple account of the key ideas and algorithms of reinforcement learning, which ranges from the history of the field's intellectual foundations to the most recent developments and applications.

Sampling-based Motion Planning for Robotic Information Gathering

An incremental sampling-based motion planning algorithm that generates maximally informative trajectories for guiding mobile robots to observe their environment and provides a rigorous analysis of the asymptotic optimality of this approach.

Assumed Density Filtering Q-learning

ADFQ, a novel Bayesian approach to off-policy TD methods, updates beliefs on state-action values, Q, through an online Bayesian inference method known as Assumed Density Filtering, and outperforms comparable algorithms on various Atari 2600 games, with drastic improvements in highly stochastic domains or domains with a large action space.

Efficient Multi-robot Search for a Moving Target

It is proved that solving the MESPP problem requires maximizing a non-decreasing, submodular objective function, which leads to theoretical bounds on the performance of the proposed linearly scalable approximation algorithm.

Human-level control through deep reinforcement learning

This work bridges the divide between high-dimensional sensory inputs and actions, resulting in the first artificial agent that is capable of learning to excel at a diverse array of challenging tasks.

Reinforcement Learning for Humanoid Robotics

This paper discusses different approaches of reinforcement learning in terms of their applicability in humanoid robotics, and demonstrates that ‘vanilla’ policy gradient methods can be significantly improved using the natural policy gradient instead of the regular policy gradient.

Anytime Planning for Decentralized Multirobot Active Information Gathering

An anytime planning algorithm is developed that progressively reduces the suboptimality of the information gathering plans while respecting real-time constraints and enables robust and scalable information gathering using a team of agile robots that adapt their cooperation to timing constraints and ad hoc communication without the need for external or centralized computation.