Corpus ID: 238583382

Learning to Follow Language Instructions with Compositional Policies

Authors: Vanya Cohen, Geraud Nangue Tasse, Nakul Gopalan, Steven James, Matthew Craig Gombolay, Benjamin Rosman
We propose a framework that learns to execute natural language instructions in an environment consisting of goal-reaching tasks that share components of their task descriptions. Our approach leverages the compositionality of both value functions and language, with the aim of reducing the sample complexity of learning novel tasks. First, we train a reinforcement learning agent to learn value functions that can be subsequently composed through a Boolean algebra to solve novel tasks. Second, we…
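To make the idea of composing value functions through a Boolean algebra concrete, here is a minimal sketch. The abstract does not say how the composition is computed, so the operators below (pointwise max for OR, pointwise min for AND, reflection between upper and lower bounds for NOT) are an assumption, and the goal names, function names, and values are hypothetical, chosen only for illustration.

```python
# Hypothetical sketch: composing tabular goal-reaching value functions.
# ASSUMPTION: OR = pointwise max, AND = pointwise min, NOT = reflection
# between an upper-bound and a lower-bound value function. None of these
# names or numbers come from the paper.

def q_or(q1, q2):
    """Value function for 'task 1 OR task 2'."""
    return [max(a, b) for a, b in zip(q1, q2)]

def q_and(q1, q2):
    """Value function for 'task 1 AND task 2'."""
    return [min(a, b) for a, b in zip(q1, q2)]

def q_not(q, q_max, q_min):
    """Value function for 'NOT task', reflected between the two bounds."""
    return [(hi + lo) - v for v, hi, lo in zip(q, q_max, q_min)]

# One value per candidate goal (illustrative numbers only).
goals = ["blue_square", "blue_circle", "red_square", "red_circle"]
q_max = [1.0, 1.0, 1.0, 1.0]     # every goal is desirable
q_min = [0.0, 0.0, 0.0, 0.0]     # no goal is desirable
q_blue = [1.0, 1.0, 0.0, 0.0]    # base task: reach a blue object
q_square = [1.0, 0.0, 1.0, 0.0]  # base task: reach a square

# Novel task "blue AND NOT square", solved without further learning.
blue_and_not_square = q_and(q_blue, q_not(q_square, q_max, q_min))
best_goal = goals[blue_and_not_square.index(max(blue_and_not_square))]
```

Under these assumptions, an instruction such as "go to a blue object that is not a square" would only need to be mapped to the expression `blue AND NOT square`. The composed values then select the goal directly from the two base value functions.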

Compositional RL Agents That Follow Language Commands in Temporal Logic
A novel form of multi-task learning for RL agents is developed that allows them to learn from a diverse set of tasks and generalize to a new set of diverse tasks without any additional training.
Learning to Parse Natural Language to Grounded Reward Functions with Weak Supervision
It is shown that parsing models learned from small data sets can generalize to commands not seen during training, and that the approach improves computation time by orders of magnitude over a baseline that performs planning during learning, while achieving comparable results.
Gated-Attention Architectures for Task-Oriented Language Grounding
An end-to-end trainable neural architecture for task-oriented language grounding in 3D environments which assumes no prior linguistic or perceptual knowledge and requires only raw pixels from the environment and the natural language instruction as input.
Simultaneously Learning Transferable Symbols and Language Groundings from Perceptual Data for Instruction Following
This work proposes first learning symbolic abstractions from demonstration data and then mapping language to those learned abstractions, which can be learned with significantly less data than end-to-end approaches and support partial behavior specification via natural language, since they permit planning using traditional planners.
Interpretable Policy Specification and Synthesis through Natural Language and RL
A novel machine learning framework is proposed that enables humans to specify, through natural language, interpretable policies in the form of easy-to-understand decision trees, uses these policies to warm-start reinforcement learning, and outperforms baselines that lack the authors' natural language initialization mechanism.
Generalization without Systematicity: On the Compositional Skills of Sequence-to-Sequence Recurrent Networks
This paper introduces the SCAN domain, consisting of a set of simple compositional navigation commands paired with the corresponding action sequences, and tests the zero-shot generalization capabilities of a variety of recurrent neural networks trained on SCAN with sequence-to-sequence methods.
What to do and how to do it: Translating natural language directives into temporal and dynamic logic representation for goal management and action execution
An integrated robotic architecture is described that can achieve the above steps by translating natural language instructions incrementally and simultaneously into formal logical goal description and action languages, which can be used both to reason about the achievability of a goal as well as to generate new action scripts to pursue the goal.
Composing Entropic Policies using Divergence Correction
An important generalization of policy improvement is extended to the maximum entropy framework, an algorithm for the practical implementation of successor features in continuous action spaces is introduced, and a novel approach is proposed that addresses the failure cases of prior work and recovers the optimal policy during transfer.
Language Models are Unsupervised Multitask Learners
It is demonstrated that language models begin to learn these tasks without any explicit supervision when trained on a new dataset of millions of webpages called WebText, suggesting a promising path towards building language processing systems which learn to perform tasks from their naturally occurring demonstrations.
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
A new language representation model, BERT, designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers, which can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.