Habitat: A Platform for Embodied AI Research

  • Manolis Savva, Abhishek Kadian, Oleksandr Maksymets, Yili Zhao, Erik Wijmans, Bhavana Jain, Julian Straub, Jia Liu, Vladlen Koltun, Jitendra Malik, Devi Parikh, Dhruv Batra
  • 2019 IEEE/CVF International Conference on Computer Vision (ICCV)
We present Habitat, a platform for research in embodied artificial intelligence (AI). Habitat enables training embodied agents (virtual robots) in highly efficient photorealistic 3D simulation. Specifically, Habitat consists of: (i) Habitat-Sim: a flexible, high-performance 3D simulator with configurable agents, sensors, and generic 3D dataset handling. Habitat-Sim is fast -- when rendering a scene from Matterport3D, it achieves several thousand frames per second (fps) running single-threaded… 

Habitat 2.0: Training Home Assistants to Rearrange their Habitat

Habitat 2.0 is a simulation platform for training virtual robots in interactive 3D environments and complex physics-enabled scenarios; its large-scale engineering contributions enable a systematic comparison of deep reinforcement learning at scale and classical sense-plan-act pipelines in long-horizon structured tasks.

ROS-X-Habitat: Bridging the ROS Ecosystem with Embodied AI

ROS-X-Habitat is a software interface that bridges the AI Habitat platform for embodied learning-based agents with other robotics resources via ROS, enabling physically realistic and photorealistic simulation that benefits the training and/or testing of vision-based embodied agents.

RoboTHOR: An Open Simulation-to-Real Embodied AI Platform

RoboTHOR offers a framework of simulated environments paired with physical counterparts to systematically explore and overcome the challenges of simulation-to-real transfer, and a platform where researchers across the globe can remotely test their embodied models in the physical world.

Out of the Box: Embodied Navigation in the Real World

This work describes the architectural discrepancies that damage the Sim2Real adaptation ability of models trained on the Habitat simulator and proposes a novel solution tailored towards deployment in real-world scenarios.

AllenAct: A Framework for Embodied AI Research

AllenAct is a modular and flexible learning framework, designed around the unique requirements of Embodied AI research, that provides first-class support for a growing collection of embodied environments, tasks and algorithms.

Megaverse: Simulating Embodied Agents at One Million Experiences per Second

The efficient design of the engine enables physics-based simulation with high-dimensional egocentric observations at more than 1,000,000 actions per second on a single 8-GPU node, thereby taking full advantage of the massive parallelism of modern GPUs.

ProcTHOR: Large-Scale Embodied AI Using Procedural Generation

ProcTHOR, a framework for procedural generation of Embodied AI environments, makes it possible to sample arbitrarily large datasets of diverse, interactive, customizable, and performant virtual environments for training and evaluating embodied agents across navigation, interaction, and manipulation tasks.

Deep Reinforcement Learning on a Budget: 3D Control and Reasoning Without a Supercomputer

This paper argues that the community should be able to address challenging problems in reasoning of mobile agents without the need for a large compute infrastructure, and provides baselines for challenging scenarios where new ideas can be iterated on quickly.

BEHAVIOR-1K: A Benchmark for Embodied AI with 1,000 Everyday Activities and Realistic Simulation

The experiments indicate that the activities in BEHAVIOR-1K are long-horizon and dependent on complex manipulation skills, both of which remain a challenge for even state-of-the-art robot learning solutions.

SAPIEN: A SimulAted Part-Based Interactive ENvironment

  • Fanbo Xiang, Yuzhe Qin, Hao Su
  • Computer Science
  • 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
SAPIEN is a realistic, physics-rich simulated environment that hosts a large-scale set of articulated objects. It enables various robotic vision and interaction tasks that require detailed part-level understanding, and the authors hope it will open research directions yet to be explored.



Gibson Env: Real-World Perception for Embodied Agents

This paper investigates the development of real-world perception for active agents, proposes the Gibson Environment for this purpose, and showcases a set of perceptual tasks learned therein.

Vision-and-Language Navigation: Interpreting Visually-Grounded Navigation Instructions in Real Environments

This work provides the first benchmark dataset for visually-grounded natural language navigation in real buildings, the Room-to-Room (R2R) dataset, and presents the Matterport3D Simulator, a large-scale reinforcement learning environment based on real imagery.

Building Generalizable Agents with a Realistic and Rich 3D Environment

House3D is a rich, extensible and efficient environment that contains 45,622 human-designed 3D scenes of houses, equipped with a diverse set of fully labeled 3D objects, textures and scene layouts. It is based on the SUNCG dataset and built with an emphasis on semantic-level generalization.

Cognitive Mapping and Planning for Visual Navigation

The Cognitive Mapper and Planner is based on a unified joint architecture for mapping and planning, such that the mapping is driven by the needs of the task, and a spatial memory with the ability to plan given an incomplete set of observations about the world.

Learning agile and dynamic motor skills for legged robots

This work introduces a method for training a neural network policy in simulation and transferring it to a state-of-the-art legged system, thereby leveraging fast, automated, and cost-effective data generation schemes.

On Evaluation of Embodied Navigation Agents

This document summarizes the consensus recommendations of a working group studying empirical methodology in navigation research. It discusses different problem statements and the role of generalization, presents evaluation measures, and provides standard scenarios that can be used for benchmarking.

The Replica Dataset: A Digital Replica of Indoor Spaces

Replica, a dataset of 18 highly photo-realistic 3D indoor scene reconstructions at room and building scale, is introduced to enable machine learning (ML) research that relies on visually, geometrically, and semantically realistic generative models of the world.

MINOS: Multimodal Indoor Simulator for Navigation in Complex Environments

MINOS is used to benchmark deep-learning-based navigation methods, to analyze the influence of environmental complexity on navigation performance, and to carry out a controlled study of multimodality in sensorimotor learning.

HoME: a Household Multimodal Environment

HoME is an open-source, OpenAI Gym-compatible platform extensible to tasks in reinforcement learning, language grounding, sound-based navigation, robotics, multi-agent learning, and more that better enables artificial agents to learn as humans do: in an interactive, multimodal, and richly contextualized setting.

AI2-THOR: An Interactive 3D Environment for Visual AI

AI2-THOR consists of near photo-realistic 3D indoor scenes in which AI agents can navigate and interact with objects to perform tasks, facilitating the building of visually intelligent models.