Factored World Models for Zero-Shot Generalization in Robotic Manipulation
@article{Biza2022FactoredWM, title={Factored World Models for Zero-Shot Generalization in Robotic Manipulation}, author={Ondrej Biza and Thomas Kipf and David Klee and Robert W. Platt and J.-W. van de Meent and Lawson L. S. Wong}, journal={ArXiv}, year={2022}, volume={abs/2202.05333} }
World models for environments with many objects face a combinatorial explosion of states: as the number of objects increases, the number of possible arrangements grows exponentially. In this paper, we learn to generalize over robotic pick-and-place tasks using object-factored world models, which combat the combinatorial explosion by ensuring that predictions are equivariant to permutations of objects. Previous object-factored models were limited either by their inability to model actions, or by…
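The two ideas in the abstract, the growth in the number of arrangements and equivariance to permutations of objects, can be illustrated with a minimal sketch. This is not the paper's model: the grid size, the integer states, and the names `count_arrangements`, `step`, and `permute` are all hypothetical, chosen only to make the property checkable. The toy transition applies the same weights to every object and aggregates the other objects by summation, so reordering the inputs reorders the outputs in the same way.

```python
import math


def count_arrangements(num_cells, num_objects):
    """Ways to place distinct objects into distinct cells, one object per cell."""
    return math.perm(num_cells, num_objects)


def step(states, actions, w_self=2, w_others=1):
    """Toy object-factored transition (illustrative assumption, not the paper's model).

    Every object shares the same weights. Each next state depends on the
    object's own state and action plus the sum of the other objects' states;
    the sum does not depend on the order of the objects.
    """
    total = sum(states)
    return [w_self * s + a + w_others * (total - s)
            for s, a in zip(states, actions)]


def permute(values, perm):
    """Reorder a list so that position i holds values[perm[i]]."""
    return [values[i] for i in perm]


if __name__ == "__main__":
    # The count grows quickly with the number of objects on a 16-cell grid.
    for k in (1, 3, 6):
        print(k, count_arrangements(16, k))

    states = [3, -1, 4, 7]
    actions = [1, 0, 0, -2]
    perm = [2, 0, 3, 1]

    # Equivariance: permuting then predicting equals predicting then permuting.
    lhs = step(permute(states, perm), permute(actions, perm))
    rhs = permute(step(states, actions), perm)
    print(lhs == rhs)
```

Integer states and weights are used so the equality check is exact; with floating-point values a tolerance-based comparison would be the safer choice.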
4 Citations
Planning for Multi-Object Manipulation with Graph Neural Network Relational Classifiers
- Computer Science
- ArXiv
- 2022
A novel graph neural network framework for multi-object manipulation that predicts how inter-object relations change given robot actions, enabling multi-step planning to reach target goal relations; the model, trained purely in simulation, is shown to transfer well to the real world.
Curious Exploration via Structured World Models Yields Zero-Shot Object Manipulation
- Computer Science
- ArXiv
- 2022
This paper proposes to use structured world models to incorporate relational inductive biases in the control loop to achieve sample-efficient and interaction-rich exploration in compositional multi-object environments and showcases that the self-reinforcing cycle between good models and good exploration also opens up another avenue: zero-shot generalization to downstream tasks via model-based planning.
Binding Actions to Objects in World Models
- Computer Science
- ArXiv
- 2022
Two attention mechanisms for binding actions to objects, soft attention and hard attention, are proposed and shown to help contrastively-trained structured world models to learn to separate individual objects in an object-based grid-world environment.
Binding Actions to Objects in World Models
- Computer Science
- 2022
Two attention mechanisms for binding actions to objects, soft attention and hard attention, are proposed and it is shown that hard attention helps contrastively-trained structured world models to learn to separate individual objects in an object-based grid-world environment.
References
Towards Practical Multi-Object Manipulation using Relational Reinforcement Learning
- Computer Science
- 2020 IEEE International Conference on Robotics and Automation (ICRA)
- 2020
It is shown that graph-based relational architectures overcome this limitation, enable learning of complex tasks when provided with a simple curriculum of tasks with increasing numbers of objects, and exhibit zero-shot generalization.
Object-centric Forward Modeling for Model Predictive Control
- Computer Science
- CoRL
- 2019
An approach to learning an object-centric forward model that can be leveraged to search for action sequences leading to desired goal configurations; in conjunction with a learned correction module, this allows for robust closed-loop execution.
Reasoning About Physical Interactions with Object-Oriented Prediction and Planning
- Computer Science
- ICLR
- 2019
This work presents a paradigm for learning object-centric representations for physical scene understanding without direct supervision of object properties; the resulting model can use its learned representations to build block towers more complicated than those observed during training.
Action Priors for Large Action Spaces in Robotics
- Computer Science
- AAMAS
- 2021
This paper proposes an alternative approach where the solutions of previously solved tasks are used to produce an action prior that can facilitate exploration in future tasks.
Visual Reinforcement Learning with Imagined Goals
- Computer Science
- NeurIPS
- 2018
An algorithm is proposed that acquires general-purpose skills by combining unsupervised representation learning and reinforcement learning of goal-conditioned policies, efficient enough to learn policies that operate on raw image observations and goals for a real-world robotic system, and substantially outperforms prior techniques.
Structured Object-Aware Physics Prediction for Video Modeling and Planning
- Computer Science
- ICLR
- 2020
STOVE is presented, a novel state-space model for videos, which explicitly reasons about objects and their positions, velocities, and interactions, and outperforms previous unsupervised models, and even approaches the performance of supervised baselines.
Graph networks as learnable physics engines for inference and control
- Computer Science
- ICML
- 2018
A new class of learnable models, based on graph networks, is introduced; these models implement an inductive bias for object- and relation-centric representations of complex, dynamical systems, and offer new opportunities for harnessing and exploiting rich knowledge about the world.
Interaction Networks for Learning about Objects, Relations and Physics
- Computer Science, Physics
- NIPS
- 2016
The interaction network is introduced, a model which can reason about how objects in complex systems interact, supporting dynamical predictions, as well as inferences about the abstract properties of the system, and is implemented using deep neural networks.
An algebraic approach to abstraction in reinforcement learning
- Computer Science
- 2004
This work introduces relativized options, a generalization of Markov sub-goal options, that allow us to define options without an absolute frame of reference and introduces an extension to the options framework that allows us to learn simultaneously at multiple levels of the hierarchy guarantees regarding the performance of hierarchical systems that employ approximate in several test-beds.
Contrastive Learning of Structured World Models
- Computer Science
- ICLR
- 2020
These experiments demonstrate that C-SWMs can overcome limitations of models based on pixel reconstruction and outperform typical representatives of this model class in highly structured environments, while learning interpretable object-based representations.