AI2-THOR: An Interactive 3D Environment for Visual AI
- Eric Kolve, Roozbeh Mottaghi, Ali Farhadi
- Computer Science, ArXiv
- 14 December 2017
AI2-THOR consists of near photo-realistic 3D indoor scenes in which AI agents can navigate and interact with objects to perform tasks, with the aim of facilitating the building of visually intelligent models.
Target-driven visual navigation in indoor scenes using deep reinforcement learning
- Yuke Zhu, Roozbeh Mottaghi, Ali Farhadi
- Computer Science, IEEE International Conference on Robotics and…
- 16 September 2016
This paper proposes an actor-critic model whose policy is a function of both the goal and the current state, which allows better generalization, and introduces the AI2-THOR framework, which provides an environment with high-quality 3D scenes and a physics engine.
A Diagram is Worth a Dozen Images
- Aniruddha Kembhavi, M. Salvato, Eric Kolve, Minjoon Seo, Hannaneh Hajishirzi, Ali Farhadi
- Computer Science, European Conference on Computer Vision
- 24 March 2016
An LSTM-based method for syntactic parsing of diagrams and a DPG-based attention model for diagram question answering are devised, and a new dataset of diagrams with exhaustive annotations of constituents and relationships is compiled.
RoboTHOR: An Open Simulation-to-Real Embodied AI Platform
- Matt Deitke, Winson Han, Ali Farhadi
- Computer Science, Computer Vision and Pattern Recognition
- 14 April 2020
RoboTHOR offers a framework of simulated environments paired with physical counterparts to systematically explore and overcome the challenges of simulation-to-real transfer, and a platform where researchers across the globe can remotely test their embodied models in the physical world.
Visual Semantic Planning Using Deep Successor Representations
- Yuke Zhu, Daniel Gordon, Ali Farhadi
- Computer Science, IEEE International Conference on Computer Vision
- 23 May 2017
This work addresses the problem of visual semantic planning: the task of predicting a sequence of actions from visual observations that transform a dynamic environment from an initial state to a goal state, and develops a deep predictive model based on successor representations.
CORA: Benchmarks, Baselines, and Metrics as a Platform for Continual Reinforcement Learning Agents
- Sam Powers, Eliot Xing, Eric Kolve, Roozbeh Mottaghi, A. Gupta
- Computer Science, ArXiv
- 19 October 2021
Progress in continual reinforcement learning has been limited due to several barriers to entry: missing code, high compute requirements
A Cordial Sync: Going Beyond Marginal Policies for Multi-Agent Embodied Tasks
- Unnat Jain, Luca Weihs, A. Schwing
- Computer Science, European Conference on Computer Vision
- 9 July 2020
This work introduces the novel task FurnMove, in which agents work together to move a piece of furniture through a living room to a goal, along with SYNC-policies (synchronize your actions coherently) and CORDIAL (coordination loss).
ManipulaTHOR: A Framework for Visual Object Manipulation
- Kiana Ehsani, Winson Han, Roozbeh Mottaghi
- Computer Science, Computer Vision and Pattern Recognition
- 22 April 2021
This work proposes a framework for object manipulation built upon the physics-enabled, visually rich AI2-THOR framework and presents a new challenge to the Embodied AI community known as ArmPointNav, which extends the popular point navigation task to object manipulation and offers new challenges including 3D obstacle avoidance.
Webly Supervised Concept Expansion for General Purpose Vision Models
- Amita Kamath, Christopher Clark, Tanmay Gupta, Eric Kolve, Derek Hoiem, Aniruddha Kembhavi
- Computer Science, European Conference on Computer Vision
- 4 February 2022
This work uses a dataset of 1M+ images spanning 10k+ visual concepts to demonstrate webly-supervised concept expansion for two existing GPVs and proposes a new architecture, GPV-2, that supports a variety of tasks, from vision tasks like classification and localization, to vision+language tasks like QA and captioning, to more niche ones like human-object interaction detection.
ProcTHOR: Large-Scale Embodied AI Using Procedural Generation
- Matt Deitke, Eli VanderBilt, Roozbeh Mottaghi
- Computer Science, ArXiv
- 14 June 2022
ProcTHOR, a framework for procedural generation of Embodied AI environments, makes it possible to sample arbitrarily large datasets of diverse, interactive, customizable, and performant virtual environments to train and evaluate embodied agents across navigation, interaction, and manipulation tasks.