AI2-THOR: An Interactive 3D Environment for Visual AI
- Eric Kolve, Roozbeh Mottaghi, Ali Farhadi
- Computer Science, ArXiv
- 14 December 2017
AI2-THOR consists of near photo-realistic 3D indoor scenes in which AI agents can navigate and interact with objects to perform tasks, facilitating the development of visually intelligent models.
ALFRED: A Benchmark for Interpreting Grounded Instructions for Everyday Tasks
- Mohit Shridhar, Jesse Thomason, D. Fox
- Computer Science, Computer Vision and Pattern Recognition
- 3 December 2019
It is shown that a baseline model based on recent embodied vision-and-language tasks performs poorly on ALFRED, suggesting that there is significant room for developing innovative grounded visual language understanding models with this benchmark.
RoboTHOR: An Open Simulation-to-Real Embodied AI Platform
- Matt Deitke, Winson Han, Ali Farhadi
- Computer Science, Computer Vision and Pattern Recognition
- 14 April 2020
RoboTHOR offers a framework of simulated environments paired with physical counterparts to systematically explore and overcome the challenges of simulation-to-real transfer, and a platform where researchers across the globe can remotely test their embodied models in the physical world.
ManipulaTHOR: A Framework for Visual Object Manipulation
- Kiana Ehsani, Winson Han, Roozbeh Mottaghi
- Computer Science, Computer Vision and Pattern Recognition
- 22 April 2021
This work proposes a framework for object manipulation built upon the physics-enabled, visually rich AI2-THOR framework and presents a new challenge to the Embodied AI community known as ArmPointNav, which extends the popular point navigation task to object manipulation and offers new challenges including 3D obstacle avoidance.
ProcTHOR: Large-Scale Embodied AI Using Procedural Generation
- Matt Deitke, Eli VanderBilt, Roozbeh Mottaghi
- Computer Science, ArXiv
- 14 June 2022
PROCTHOR, a framework for procedural generation of Embodied AI environments, enables the sampling of arbitrarily large datasets of diverse, interactive, customizable, and performant virtual environments for training and evaluating embodied agents across navigation, interaction, and manipulation tasks.
Learning Generalizable Visual Representations via Interactive Gameplay
- Luca Weihs, Aniruddha Kembhavi, Ali Farhadi
- Computer Science, International Conference on Learning…
- 2021
Artificial Agents Learn Flexible Visual Representations by Playing a Hiding Game
- Luca Weihs, Aniruddha Kembhavi, Ali Farhadi
- Biology, ArXiv
- 17 December 2019
This work is the first to show that embodied adversarial reinforcement learning agents playing cache, a variant of hide-and-seek, in a high-fidelity, interactive environment learn representations of their observations encoding information such as occlusion, object permanence, free space, and containment.
Retrospectives on the Embodied AI Workshop
- Matt Deitke, Dhruv Batra, Jiajun Wu
- Computer Science, ArXiv
- 13 October 2022
A retrospective on the state of Embodied AI research is presented, grouping the 13 challenges from the Embodied AI Workshop at CVPR into three themes: visual navigation, rearrangement, and integration.