Learning to Map for Active Semantic Goal Navigation
@article{Georgakis2021LearningTM,
  title={Learning to Map for Active Semantic Goal Navigation},
  author={Georgios Georgakis and Bernadette Bucher and Karl Schmeckpeper and Siddharth Singh and Kostas Daniilidis},
  journal={ArXiv},
  year={2021},
  volume={abs/2106.15648}
}
We consider the problem of object goal navigation in unseen environments. In our view, solving this problem requires learning of contextual semantic priors, a challenging endeavour given the spatial and semantic variability of indoor environments. Current methods learn to implicitly encode these priors through goal-oriented navigation policy functions operating on spatial representations that are limited to the agent’s observable areas. In this work, we propose a novel framework that actively…
23 Citations
PEANUT: Predicting and Navigating to Unseen Targets
- Computer Science, ArXiv
- 2022
This work's prediction model is lightweight and can be trained in a supervised manner using a relatively small amount of passively collected data; it achieves state-of-the-art results on both datasets despite not using any additional data for training.
Uncertainty-driven Planner for Exploration and Navigation
- Computer Science, 2022 International Conference on Robotics and Automation (ICRA)
- 2022
A novel planning framework is presented that first learns to generate occupancy maps beyond the field-of-view of the agent, and second leverages the model uncertainty over the generated areas to formulate path selection policies for each task of interest.
Navigating to Objects in Unseen Environments by Distance Prediction
- Computer Science, 2022 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
- 2022
The task is to navigate an agent to an object category in unseen environments without a pre-built map. The approach predicts the distance to the target from a bird's-eye view semantic map, using semantically related objects as cues.
Comparison of Model-Free and Model-Based Learning-Informed Planning for PointGoal Navigation
- Computer Science, ArXiv
- 2022
This work adapts the POMDP subgoal framework proposed by [1] and modifies the component that estimates frontier properties to use partial semantic maps of indoor scenes built from semantic segmentation of images. The resulting planner is shown to be robust and efficient because it leverages informative, learned properties of the frontiers, in contrast to an optimistic frontier-based planner.
A Contextual Bandit Approach for Learning to Plan in Environments with Probabilistic Goal Configurations
- Computer Science, ArXiv
- 2022
A contextual-bandit agent explores the environment efficiently by showing optimism in the face of uncertainty and learns a model of the likelihood of spotting different objects from each navigable location; the agent can efficiently search indoor environments for movable objects as well as static ones.
3D-Aware Object Goal Navigation via Simultaneous Exploration and Identification
- Computer Science, ArXiv
- 2022
This work proposes a framework for the challenging 3D-aware ObjectNav based on two straightforward sub-policies, which achieves the best performance among all modular-based methods on the Matterport3D and Gibson datasets, while requiring (up to 30x) less computational cost for training.
Weakly-Supervised Multi-Granularity Map Learning for Vision-and-Language Navigation
- Computer Science, ArXiv
- 2022
This work proposes a multi-granularity map, which contains both object-grained details and semantic classes, to represent objects more comprehensively, together with a weakly-supervised auxiliary task that requires the agent to localize instruction-relevant objects on the map.
Self-supervised Pre-training for Semantic Segmentation in an Indoor Scene
- Computer Science, ArXiv
- 2022
RegConsist is proposed, a method for self-supervised pre-training of a semantic segmentation model that exploits the ability of the agent to move and register multiple views in the novel environment. It outperforms models pre-trained on ImageNet and achieves competitive performance with models that are trained for exactly the same task but on a different dataset.
Cross-modal Map Learning for Vision and Language Navigation
- Computer Science, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
- 2022
A cross-modal map learning model for vision-and-language navigation is proposed that first learns to predict the top-down semantics on an egocentric map for both observed and unobserved regions, and then predicts a path towards the goal as a set of way-points.
Active Exploration based on Information Gain by Particle Filter for Efficient Spatial Concept Formation
- Computer Science, ArXiv
- 2022
An active inference method, spatial concept formation with information gain-based active exploration (SpCoAE), that combines sequential Bayesian inference by particle filters and position determination based on information gain in a probabilistic generative model is proposed.
References
SHOWING 1-10 OF 68 REFERENCES
Visual Semantic Navigation using Scene Priors
- Computer Science, ICLR
- 2019
This work proposes using Graph Convolutional Networks to incorporate prior knowledge into a deep reinforcement learning framework and shows that semantic knowledge significantly improves performance and generalization to unseen scenes and/or objects.
SSCNav: Confidence-Aware Semantic Scene Completion for Visual Semantic Navigation
- Computer Science, 2021 IEEE International Conference on Robotics and Automation (ICRA)
- 2021
This paper introduces SSCNav, an algorithm that explicitly models scene priors using a confidence-aware semantic scene completion module to complete the scene and guide the agent's navigation planning, and demonstrates that the proposed scene completion module improves the efficiency of the downstream navigation policies.
Uncertainty-driven Planner for Exploration and Navigation
- Computer Science, 2022 International Conference on Robotics and Automation (ICRA)
- 2022
A novel planning framework is presented that first learns to generate occupancy maps beyond the field-of-view of the agent, and second leverages the model uncertainty over the generated areas to formulate path selection policies for each task of interest.
Cognitive Mapping and Planning for Visual Navigation
- Computer Science, International Journal of Computer Vision
- 2019
The Cognitive Mapper and Planner is based on a unified joint architecture for mapping and planning, such that the mapping is driven by the needs of the task, and on a spatial memory with the ability to plan given an incomplete set of observations about the world.
Seeing the Un-Scene: Learning Amodal Semantic Maps for Room Navigation
- Computer Science, ECCV
- 2020
A learning-based approach for room navigation using semantic maps that learns to predict top-down belief maps of regions that lie beyond the agent's field of view while modeling architectural and stylistic regularities in houses.
Semantic Curiosity for Active Visual Learning
- Computer Science, ECCV
- 2020
The exploration policy trained via semantic curiosity generalizes to novel scenes and helps train an object detector that outperforms baselines trained with other possible alternatives such as random exploration, prediction-error curiosity, and coverage-maximizing exploration.
MultiON: Benchmarking Semantic Map Memory using Multi-Object Navigation
- Computer Science, NeurIPS
- 2020
This work proposes the multiON task, which requires navigation to an episode-specific sequence of objects in a realistic environment. The task generalizes ObjectGoal navigation and explicitly tests the ability of navigation agents to locate previously observed goal objects.
Target-driven visual navigation in indoor scenes using deep reinforcement learning
- Computer Science, 2017 IEEE International Conference on Robotics and Automation (ICRA)
- 2017
This paper proposes an actor-critic model whose policy is a function of the goal as well as the current state, which allows better generalization, and introduces the AI2-THOR framework, which provides an environment with high-quality 3D scenes and a physics engine.
Simultaneous Mapping and Target Driven Navigation
- Computer Science, ArXiv
- 2019
The use of semantic information is shown to improve localization accuracy, and the ability to store a spatial semantic map aids the target-driven navigation policy.
Neural Topological SLAM for Visual Navigation
- Computer Science, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
- 2020
This paper designs topological representations for space that effectively leverage semantics and afford approximate geometric reasoning, and describes supervised learning-based algorithms that can build, maintain and use such representations under noisy actuation.