ALFRED: A Benchmark for Interpreting Grounded Instructions for Everyday Tasks

@inproceedings{Shridhar2020ALFREDAB,
  title={ALFRED: A Benchmark for Interpreting Grounded Instructions for Everyday Tasks},
  author={Mohit Shridhar and Jesse Thomason and Daniel Gordon and Yonatan Bisk and Winson Han and Roozbeh Mottaghi and Luke Zettlemoyer and Dieter Fox},
  booktitle={2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2020},
  pages={10737--10746}
}
We present ALFRED (Action Learning From Realistic Environments and Directives), a benchmark for learning a mapping from natural language instructions and egocentric vision to sequences of actions for household tasks. ALFRED includes long, compositional tasks with non-reversible state changes to shrink the gap between research benchmarks and real-world applications. ALFRED consists of expert demonstrations in interactive visual environments for 25k natural language directives. These directives…
43 Citations
ALFWorld: Aligning Text and Embodied Environments for Interactive Learning
  • 2
  • PDF
MOCA: A Modular Object-Centric Approach for Interactive Instruction Following
  • PDF
Are We There Yet? Learning to Localize in Embodied Instruction Following
  • Highly Influenced
  • PDF
Language as a Cognitive Tool to Imagine Goals in Curiosity-Driven Exploration
  • 11
  • PDF
Towards Ecologically Valid Research on Language User Interfaces
  • 10
  • PDF
A modular vision language navigation and manipulation framework for long horizon compositional tasks in indoor environment
  • Highly Influenced
  • PDF
Grounding Language in Play
  • 11
  • PDF

References

SHOWING 1-10 OF 62 REFERENCES
Vision-and-Language Navigation: Interpreting Visually-Grounded Navigation Instructions in Real Environments
  • 347
  • Highly Influential
  • PDF
Speaker-Follower Models for Vision-and-Language Navigation
  • 136
  • PDF
Tell me Dave: Context-sensitive grounding of natural language to manipulation instructions
  • 116
  • PDF
Towards a Dataset for Human Computer Communication via Grounded Language Acquisition
  • 19
  • PDF
Grounding Robot Plans from Natural Language Instructions with Incomplete World Knowledge
  • 14
  • PDF
Walk the Talk: Connecting Language, Knowledge, and Action in Route Instructions
  • 316
  • PDF
Weakly Supervised Learning of Semantic Parsers for Mapping Instructions to Actions
  • 370
  • PDF
Neural Modular Control for Embodied Question Answering
  • 59
  • PDF