AdaAfford: Learning to Adapt Manipulation Affordance for 3D Articulated Objects via Few-shot Interactions

@inproceedings{Wang2022AdaAffordLT,
  title={AdaAfford: Learning to Adapt Manipulation Affordance for 3D Articulated Objects via Few-shot Interactions},
  author={Yian Wang and Ruihai Wu and Kaichun Mo and Jiaqi Ke and Qingnan Fan and Leonidas J. Guibas and Hao Dong},
  booktitle={European Conference on Computer Vision},
  year={2022}
}
Perceiving and interacting with 3D articulated objects, such as cabinets, doors, and faucets, pose particular challenges for future home-assistant robots performing daily tasks in human environments. Besides parsing the articulated parts and joint parameters, researchers recently advocate learning manipulation affordance over the input shape geometry, which is more task-aware and geometrically fine-grained. However, taking only passive observations as inputs, these methods ignore many hidden…
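
A minimal sketch of the core idea, conditioning per-point affordance prediction on a handful of test-time interactions, is given below. This is not the authors' implementation; the module names, feature dimensions, and the simple mean-pooling aggregation are all illustrative assumptions.

import torch
import torch.nn as nn

class FewShotAffordance(nn.Module):
    def __init__(self, point_feat_dim=128, interact_dim=16, hidden=128):
        super().__init__()
        # Per-interaction encoder; a real system would encode the probing action
        # and its observed outcome (e.g. the resulting part motion).
        self.interact_encoder = nn.Sequential(
            nn.Linear(interact_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden))
        # Per-point affordance head over geometry features plus the interaction code.
        self.head = nn.Sequential(
            nn.Linear(point_feat_dim + hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1))

    def forward(self, point_feats, interactions):
        # point_feats: (N, point_feat_dim) per-point features from a point-cloud backbone
        # interactions: (K, interact_dim) encodings of K few-shot probing interactions
        z = self.interact_encoder(interactions).mean(dim=0)   # aggregate the K probes
        z = z.expand(point_feats.shape[0], -1)                # broadcast to all points
        return torch.sigmoid(self.head(torch.cat([point_feats, z], dim=-1)))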

End-to-End Affordance Learning for Robotic Manipulation

This study leverages visual affordance by using the contact information generated during RL training to predict contact maps of interest, yielding an end-to-end affordance learning framework that generalizes across different types of manipulation tasks.
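
As a rough illustration of how contacts logged during RL rollouts could be turned into dense supervision, the following sketch converts recorded contact positions into a soft per-point contact map; the data layout and the Gaussian radius are assumptions for illustration, not the paper's procedure.

import numpy as np

def contact_map_from_rollouts(points, contact_positions, sigma=0.02):
    # points: (N, 3) object surface point cloud
    # contact_positions: (M, 3) contact locations observed during RL training
    if len(contact_positions) == 0:
        return np.zeros(len(points))
    # Distance from every surface point to its nearest recorded contact.
    d = np.linalg.norm(points[:, None, :] - contact_positions[None, :, :], axis=-1)
    nearest = d.min(axis=1)
    # Soft label: points near frequently contacted regions get high scores.
    return np.exp(-0.5 * (nearest / sigma) ** 2)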

Learning Agent-Aware Affordances for Closed-Loop Interaction with Articulated Objects

The concept of agent-aware affordances, which fully reflect the agent's capabilities and embodiment, is introduced, and it is shown that they outperform state-of-the-art counterparts that are conditioned only on the end-effector geometry.

H-SAUR: Hypothesize, Simulate, Act, Update, and Repeat for Understanding Object Articulations from Interactions

This work proposes H-SAUR, a probabilistic generative framework that simultaneously generates a distribution of hypotheses about how objects articulate given input observations, captures the certainty over hypotheses over time, and infers plausible actions for exploration and goal-conditioned manipulation in autonomous agents.
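
The hypothesize-simulate-act-update loop can be sketched schematically as follows; the hypothesis models, simulator, executor, and likelihood function are placeholders standing in for whatever a concrete system would supply.

def hsaur_loop(hypotheses, weights, simulate, execute, likelihood, steps=5):
    # hypotheses: candidate articulation models; weights: their current probabilities
    for _ in range(steps):
        # Act: follow the action suggested by the currently most likely hypothesis.
        best = max(range(len(hypotheses)), key=lambda i: weights[i])
        action = simulate(hypotheses[best])
        outcome = execute(action)                 # interact with the real object
        # Update: reweight each hypothesis by how well it explains the outcome.
        weights = [w * likelihood(h, action, outcome)
                   for h, w in zip(hypotheses, weights)]
        total = sum(weights) or 1.0
        weights = [w / total for w in weights]    # renormalize the belief
    return hypotheses, weights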

DualAfford: Learning Collaborative Visual Affordance for Dual-gripper Object Manipulation

This work proposes a novel learning framework, DualAfford, that learns collaborative affordance for dual-gripper manipulation tasks by reducing the quadratic problem over two grippers into two disentangled yet interconnected subtasks for efficient learning.
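
One way to picture the disentangled-but-conditioned decomposition is a second-gripper affordance module that takes the first gripper's chosen contact as extra input; the shapes and conditioning scheme below are assumptions for illustration only.

import torch
import torch.nn as nn

class SecondGripperAffordance(nn.Module):
    def __init__(self, point_feat_dim=128, hidden=128):
        super().__init__()
        # +3 inputs for the xyz of the first gripper's contact point.
        self.net = nn.Sequential(
            nn.Linear(point_feat_dim + 3, hidden), nn.ReLU(),
            nn.Linear(hidden, 1))

    def forward(self, point_feats, first_contact_xyz):
        # point_feats: (N, point_feat_dim); first_contact_xyz: (3,)
        cond = first_contact_xyz.expand(point_feats.shape[0], -1)
        return torch.sigmoid(self.net(torch.cat([point_feats, cond], dim=-1)))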

References

VAT-Mart: Learning Visual Action Trajectory Proposals for Manipulating 3D ARTiculated Objects

This paper proposes object-centric actionable visual priors as a novel perception-interaction handshaking point, where the perception system outputs more actionable guidance than kinematic structure estimation by predicting dense geometry-aware, interaction-aware, and task-aware visual action affordance and trajectory proposals.

Where2Act: From Pixels to Actions for Articulated 3D Objects

This paper proposes a learning-from-interaction framework with an online data sampling strategy that allows training the network in simulation (SAPIEN) and generalizing across categories, and it proposes, discusses, and evaluates novel network architectures that, given image and depth data, predict the set of actions possible at each pixel and the regions over articulated parts that are likely to move under the force.
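
A toy version of such an online interaction-sampling loop is shown below; sim_try_action and the uniform sampling policy are hypothetical placeholders for illustration, not a real SAPIEN API.

import random

def collect_interaction_data(point_cloud, sim_try_action, num_samples=100):
    # point_cloud: list of candidate contact points on the object
    # sim_try_action: callable that executes a primitive at a point and
    #                 reports whether the articulated part moved
    data = []
    for _ in range(num_samples):
        p = random.choice(point_cloud)        # pick a candidate interaction point
        success = sim_try_action(p)           # try it in simulation
        data.append((p, float(success)))      # keep point-level supervision
    return data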

GanHand: Predicting Human Grasp Affordances in Multi-Object Scenes

A generative model is introduced that jointly reasons at all levels and refines the 51 DoF of a 3D hand model to minimize a graspability loss; it can robustly predict realistic grasps, even in cluttered scenes with multiple objects in close contact.

Learning Dexterous Grasping with Object-Centric Visual Affordances

This work proposes an approach for learning dexterous grasping that embeds an object-centric visual affordance model within a deep reinforcement learning loop to learn grasping policies that favor the same object regions favored by people.

Learning to Move with Affordance Maps

This paper designs an agent that, through active self-supervised experience gathering, learns to predict a spatial affordance map indicating which parts of a scene are navigable, and shows that learned affordance maps can augment traditional approaches to both exploration and navigation, providing significant improvements in performance.

CPF: Learning a Contact Potential Field to Model the Hand-Object Interaction

An explicit contact representation, namely Contact Potential Field (CPF), and a learning-fitting hybrid framework, namely MIHO, for Modeling the Interaction of Hand and Object, are presented.

Deep part induction from articulated object pairs

This paper explores how the observation of different articulation states provides evidence for part structure and motion of 3D objects, and learns a neural network architecture with three modules that respectively propose correspondences, estimate 3D deformation flows, and perform segmentation.

S4G: Amodal Single-view Single-Shot SE(3) Grasp Detection in Cluttered Scenes

This paper studies the problem of 6-DoF grasping by a parallel gripper in a cluttered scene captured using a commodity depth sensor from a single viewpoint and proposes a single-shot grasp proposal network, trained with synthetic data and tested in real-world scenarios.

Synergies Between Affordance and Geometry: 6-DoF Grasp Detection via Implicit Representations

This work proposes to utilize the synergies between grasp affordance and 3D reconstruction through multi-task learning of a shared representation, which takes advantage of deep implicit functions, a continuous and memory-efficient representation, to enable differentiable training of both tasks.
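
A rough sketch of a shared implicit representation with an occupancy head and a grasp-quality head, both queried at continuous 3D coordinates, is given below; the feature sizes and conditioning scheme are assumptions rather than the paper's architecture.

import torch
import torch.nn as nn

class ImplicitGraspAndGeometry(nn.Module):
    def __init__(self, scene_feat_dim=256, hidden=128):
        super().__init__()
        # Shared trunk conditioned on a scene code and a 3D query location.
        self.trunk = nn.Sequential(
            nn.Linear(scene_feat_dim + 3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU())
        self.occupancy_head = nn.Linear(hidden, 1)   # geometry / reconstruction branch
        self.grasp_head = nn.Linear(hidden, 1)       # grasp-affordance branch

    def forward(self, scene_feat, query_xyz):
        # scene_feat: (scene_feat_dim,) scene code; query_xyz: (Q, 3) query points
        cond = scene_feat.expand(query_xyz.shape[0], -1)
        h = self.trunk(torch.cat([cond, query_xyz], dim=-1))
        return torch.sigmoid(self.occupancy_head(h)), torch.sigmoid(self.grasp_head(h))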

O2O-Afford: Annotation-Free Large-Scale Object-Object Affordance Learning

By constructing four object-object interaction task environments using physical simulation (SAPIEN) and thousands of ShapeNet models with rich geometric diversity, this work is able to conduct large-scale object-object affordance learning without the need for human annotations or demonstrations.