Ask4Help: Learning to Leverage an Expert for Embodied Tasks
@article{Singh2022Ask4HelpLT,
  title   = {Ask4Help: Learning to Leverage an Expert for Embodied Tasks},
  author  = {Kunal Pratap Singh and Luca Weihs and Alvaro Herrasti and Jonghyun Choi and Aniruddha Kembhavi and Roozbeh Mottaghi},
  journal = {ArXiv},
  year    = {2022},
  volume  = {abs/2211.09960}
}
Embodied AI agents continue to become more capable every year with the advent of new models, environments, and benchmarks, but are still far from being performant and reliable enough to be deployed in real, user-facing applications. In this paper, we ask: can we bridge this gap by enabling agents to ask for assistance from an expert such as a human being? To this end, we propose the Ask4Help policy, which augments agents with the ability to request, and then use, expert assistance. Ask4Help…
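The mechanism the abstract describes, an agent that decides at each step whether to act on its own or request an expert action, can be pictured as a simple gated control loop. The sketch below is illustrative only: the function and policy names are hypothetical, the confidence-threshold ask rule is an assumption, and the actual Ask4Help policy is learned to balance task performance against the cost of querying the expert.

```python
# Hypothetical sketch of an Ask4Help-style control loop. A gating "ask"
# policy decides, per step, whether the frozen agent acts autonomously or
# an expert's action is requested and executed instead. All names here
# are illustrative, not the paper's actual interface.

def rollout_with_expert(agent_policy, expert_policy, ask_policy, observations):
    """Run one episode, querying the expert only when ask_policy fires.

    Returns the executed actions and the number of expert queries
    (the quantity an Ask4Help-style policy is trained to keep small).
    """
    actions = []
    expert_queries = 0
    for obs in observations:
        if ask_policy(obs):          # request assistance at this step
            action = expert_policy(obs)
            expert_queries += 1
        else:                        # act autonomously
            action = agent_policy(obs)
        actions.append(action)
    return actions, expert_queries


# Toy demo: ask for help only when an (assumed) confidence feature is low.
agent = lambda obs: "agent_move"
expert = lambda obs: "expert_move"
ask = lambda obs: obs["confidence"] < 0.5

observations = [{"confidence": c} for c in (0.9, 0.3, 0.8, 0.2)]
acts, queries = rollout_with_expert(agent, expert, ask, observations)
# queries == 2: the expert was consulted on the two low-confidence steps
```

A learned ask policy replaces the fixed threshold above, trading off expert cost against episode reward.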
References
Showing 1–10 of 58 references
AllenAct: A Framework for Embodied AI Research
- Computer Science · ArXiv · 2020
AllenAct is introduced, a modular and flexible learning framework designed with a focus on the unique requirements of Embodied AI research that provides first-class support for a growing collection of embodied environments, tasks and algorithms.
Just Ask: An Interactive Learning Framework for Vision and Language Navigation
- Computer Science · AAAI · 2020
This work proposes an interactive learning framework to endow the agent with the ability to ask for users' help in ambiguous situations, and designs a continual learning strategy, which can be viewed as a data augmentation method, for the agent to further improve by utilizing its interaction history with a human.
Help, Anna! Visual Navigation with Natural Multimodal Assistance via Retrospective Curiosity-Encouraging Imitation Learning
- Computer Science · EMNLP · 2019
“Help, Anna!” (HANNA), an interactive photo-realistic simulator in which an agent fulfills object-finding tasks by requesting and interpreting natural language-and-vision assistance, and an imitation learning algorithm that teaches the agent to avoid repeating past mistakes while simultaneously predicting its own chances of making future progress.
Auxiliary Tasks and Exploration Enable ObjectGoal Navigation
- Computer Science · 2021 IEEE/CVF International Conference on Computer Vision (ICCV)
This work proposes that agents will act to simplify their visual inputs so as to smooth their RNN dynamics, and that auxiliary tasks reduce overfitting by minimizing effective RNN dimensionality; i.e. a performant ObjectNav agent that must maintain coherent plans over long horizons does so by learning smooth, low-dimensional recurrent dynamics.
Vision-and-Dialog Navigation
- Computer Science · CoRL · 2019
This work introduces Cooperative Vision-and-Dialog Navigation, a dataset of over 2k embodied, human-human dialogs situated in simulated, photorealistic home environments and establishes an initial, multi-modal sequence-to-sequence model.
Asking for Help Using Inverse Semantics
- Computer Science · Robotics: Science and Systems · 2014
This work demonstrates an approach for enabling a robot to recover from failures by communicating its need for specific help to a human partner using natural language, and presents a novel inverse semantics algorithm for generating effective help requests.
Recovering from failure by asking for help
- Computer Science · Auton. Robots · 2015
This work demonstrates an approach for enabling a robot to recover from failures by communicating its need for specific help to a human partner using natural language, and presents a novel inverse semantics algorithm for generating effective help requests.
Auxiliary Tasks Speed Up Learning PointGoal Navigation
- Computer Science · CoRL · 2020
This work develops a method to significantly increase sample and time efficiency in learning PointNav using self-supervised auxiliary tasks (e.g. predicting the action taken between two egocentric observations, predicting the distance between two observations from a trajectory, etc.).
THDA: Treasure Hunt Data Augmentation for Semantic Navigation
- Computer Science · 2021 IEEE/CVF International Conference on Computer Vision (ICCV)
This paper shows that the key problem in ObjectNav is overfitting, and introduces Treasure Hunt Data Augmentation (THDA) to address it.
IQA: Visual Question Answering in Interactive Environments
- Computer Science · 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition
This work proposes the Hierarchical Interactive Memory Network (HIMN), consisting of a factorized set of controllers that allow the system to operate at multiple levels of temporal abstraction, and shows that it outperforms popular single-controller methods on IQUAD V1.