Corpus ID: 240354495

Leveraging SE(3) Equivariance for Self-Supervised Category-Level Object Pose Estimation

@inproceedings{li2021leveraging,
  title={Leveraging SE(3) Equivariance for Self-Supervised Category-Level Object Pose Estimation},
  author={Xiaolong Li and Yijia Weng and Li Yi and Leonidas J. Guibas and A. Lynn Abbott and Shuran Song and He Wang},
  booktitle={Neural Information Processing Systems},
  year={2021}
}
Category-level object pose estimation aims to find 6D object poses of previously unseen object instances from known categories without access to object CAD models. To reduce the huge amount of pose annotations needed for category-level learning, we propose for the first time a self-supervised learning framework to estimate category-level 6D object pose from single 3D point clouds. During training, our method assumes no ground-truth pose annotations, no CAD models, and no multi-view supervision… 
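The core property the abstract relies on, equivariance, means that rotating the input point cloud rotates the network's output in lockstep. A minimal sketch of this idea (not the paper's actual network) is a Vector-Neuron-style linear layer that mixes feature channels while leaving the 3D vector axis untouched; the layer then commutes with any rotation by construction. The layer name `vn_linear` and all tensor shapes below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "vector neuron" linear layer: mixes feature channels with a weight
# matrix while leaving the 3D vector axis untouched, so it commutes with
# any rotation R applied along that axis.
def vn_linear(x, W):
    # x: (num_points, in_channels, 3), W: (out_channels, in_channels)
    return np.einsum("oc,ncd->nod", W, x)

def random_rotation(rng):
    # QR decomposition of a Gaussian matrix yields a random orthogonal matrix
    Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    if np.linalg.det(Q) < 0:
        Q[:, 0] *= -1.0  # flip a column so det(Q) = +1 (a proper rotation)
    return Q

x = rng.normal(size=(128, 8, 3))   # point features: 8 channels of 3D vectors
W = rng.normal(size=(16, 8))       # channel-mixing weights
R = random_rotation(rng)

out_then_rotate = vn_linear(x, W) @ R.T   # f(x) then rotate
rotate_then_out = vn_linear(x @ R.T, W)   # rotate then f(x)
print(np.allclose(out_then_rotate, rotate_then_out))  # True: SO(3)-equivariant
```

Because the layer only contracts over the channel index, the rotation matrix passes through it unchanged, which is exactly the equivariance property the self-supervised framework exploits.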

UDA-COPE: Unsupervised Domain Adaptation for Category-level Object Pose Estimation

This work proposes an unsupervised domain adaptation (UDA) for category-level object pose estimation, called UDA-COPE, which exploits a teacher-student self-supervised learning scheme to train a pose estimation network without using target domain pose labels.

Zero-Shot Category-Level Object Pose Estimation

This paper proposes a novel method based on semantic correspondences from a self-supervised vision transformer to solve the pose estimation problem, and extends much of the existing literature by removing the need for pose-labelled datasets or category-specific CAD models for training or inference.

Towards Self-Supervised Category-Level Object Pose and Size Estimation

This work proposes a label-free method that learns to enforce geometric consistency between a category template mesh and the observed object point cloud in a self-supervised manner, and finds that it outperforms a simple traditional baseline by large margins while remaining competitive with some fully supervised approaches.

ObPose: Leveraging Canonical Pose for Object-Centric Scene Inference in 3D

ObPose is presented, an unsupervised object-centric generative model that learns to segment 3D objects from RGB-D video and outperforms the current state-of-the-art in 3D scene inference (ObSuRF) by a significant margin.

Rotationally Equivariant 3D Object Detection

This work considers the object detection problem in 3D scenes, where an object bounding box should be equivariant with respect to the object pose and independent of the scene motion, and proposes the Equivariant Object detection Network (EON) with a rotation-equivariance suspension design to achieve object-level equivariance.

ObPose: Leveraging Pose for Object-Centric Scene Inference in 3D

ObPose is evaluated quantitatively on the YCB and CLEVR datasets for unsupervised scene segmentation, outperforming the current state-of-the-art in 3D scene inference (ObSuRF) by a significant margin.

SO(3)-Pose: SO(3)-Equivariance Learning for 6D Object Pose Estimation

Unlike most existing pose estimation methods, SO(3)-Pose not only implements information communication between the RGB and depth channels, but also naturally absorbs SO(3)-equivariant geometry knowledge from depth images, leading to better appearance and geometry representation learning.

Correct and Certify: A New Approach to Self-Supervised 3D-Object Perception

The proposed self-supervised training approach achieves performance comparable to fully supervised baselines while not requiring pose or keypoint supervision on real data.

Zero-Shot Category-Level Object Pose Estimation: Supplementary Material

In this appendix, we first discuss our choice of dataset, followed by our choice of evaluation categories and sequences, a description of our pose-labelling procedure, and data pre-processing.

Shape-Pose Disentanglement using SE(3)-equivariant Vector Neurons

An unsupervised technique is introduced for encoding point clouds into a canonical shape representation by disentangling shape and pose, enabling the approach to focus on learning a consistent canonical pose for a class of objects.

Learning Canonical Shape Space for Category-Level 6D Object Pose and Size Estimation

This work trains a variational auto-encoder (VAE) to generate 3D point clouds in the canonical space from an RGB-D image, and integrates the learning of CASS with pose and size estimation in an end-to-end trainable network, achieving state-of-the-art performance.

Normalized Object Coordinate Space for Category-Level 6D Object Pose and Size Estimation

The proposed method is able to robustly estimate the pose and size of unseen object instances in real environments while also achieving state-of-the-art performance on standard 6D pose estimation benchmarks.
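NOCS-style methods predict, for each observed point, its coordinate in a normalized canonical space; pose and size then follow from aligning the two point sets with a similarity transform. A minimal sketch of the standard Umeyama alignment commonly used for this step (the function name and shapes are my own, not from the paper):

```python
import numpy as np

def umeyama(src, dst):
    # Least-squares similarity transform: dst ≈ s * src @ R.T + t
    # src: (N, 3) canonical (NOCS) coordinates, dst: (N, 3) observed points
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    sc, dc = src - mu_s, dst - mu_d
    cov = dc.T @ sc / len(src)                  # 3x3 cross-covariance
    var_src = (sc ** 2).sum() / len(src)        # total source variance
    U, D, Vt = np.linalg.svd(cov)
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        S[2, 2] = -1.0                          # guard against reflections
    R = U @ S @ Vt
    s = np.trace(np.diag(D) @ S) / var_src
    t = mu_d - s * R @ mu_s
    return s, R, t

# Recover a known similarity transform from noiseless correspondences
rng = np.random.default_rng(1)
src = rng.normal(size=(100, 3))
angle = 0.7
R_true = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                   [np.sin(angle),  np.cos(angle), 0.0],
                   [0.0, 0.0, 1.0]])
s_true, t_true = 1.8, np.array([0.3, -1.2, 2.0])
dst = s_true * src @ R_true.T + t_true

s, R, t = umeyama(src, dst)
print(np.isclose(s, s_true), np.allclose(R, R_true), np.allclose(t, t_true))
```

The recovered scale gives the object size, and (R, t) give its 6D pose; in practice the correspondences are noisy, so this solver is typically wrapped in RANSAC.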

Augmented Autoencoders: Implicit 3D Orientation Learning for 6D Object Detection

This novel 3D orientation estimation is based on a variant of the Denoising Autoencoder that is trained on simulated views of a 3D model using Domain Randomization and achieves state-of-the-art performance on the T-LESS dataset both in the RGB and RGB-D domain.
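At test time, Augmented-Autoencoder-style methods retrieve orientation by comparing the encoded latent of the query image against a precomputed codebook of latents from rendered views, each paired with the rotation used to render it. A minimal sketch of that lookup, assuming the encoder and codebook already exist (the stand-in random latents below are purely illustrative):

```python
import numpy as np

# Orientation retrieval by nearest-neighbour lookup in latent space: each
# codebook entry pairs the latent code of a rendered view with the rotation
# used to render that view.
def lookup_orientation(z_query, codebook_z, codebook_R):
    zq = z_query / np.linalg.norm(z_query)
    Z = codebook_z / np.linalg.norm(codebook_z, axis=1, keepdims=True)
    best = int(np.argmax(Z @ zq))          # index of highest cosine similarity
    return codebook_R[best]

rng = np.random.default_rng(2)
codebook_z = rng.normal(size=(500, 64))     # stand-in latent codes
codebook_R = rng.normal(size=(500, 3, 3))   # one rotation per codebook entry
z = codebook_z[42] + 0.01 * rng.normal(size=64)  # noisy query near entry 42
R_hat = lookup_orientation(z, codebook_z, codebook_R)
print(np.allclose(R_hat, codebook_R[42]))
```

Cosine similarity rather than Euclidean distance makes the lookup robust to overall scaling of the latent codes, which is one reason it is the usual choice for such codebooks.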

Shape Prior Deformation for Categorical 6D Object Pose and Size Estimation

A deep network is proposed that reconstructs the 3D object model by explicitly modeling its deformation from a pre-learned categorical shape prior, then infers dense correspondences between the depth observation of the object instance and the reconstructed 3D model to jointly estimate the 6D object pose and size.

Category Level Object Pose Estimation via Neural Analysis-by-Synthesis

This paper combines a gradient-based fitting procedure with a parametric neural image synthesis module that is capable of implicitly representing the appearance, shape, and pose of entire object categories, removing the need for an explicit CAD model per object instance.

Multi-view Consistency as Supervisory Signal for Learning Shape and Pose Prediction

This work presents a framework for learning single-view shape and pose prediction without using direct supervision for either, and demonstrates the applicability of the framework in a realistic setting which is beyond the scope of existing techniques.

Self6D: Self-Supervised Monocular 6D Object Pose Estimation

This work proposes the idea of monocular 6D pose estimation by means of self-supervised learning, removing the need for real annotations, and demonstrates that the proposed self-supervision significantly enhances the model's original performance, outperforming all other methods relying on synthetic data or employing elaborate techniques from the domain adaptation realm.

PoseCNN: A Convolutional Neural Network for 6D Object Pose Estimation in Cluttered Scenes

This work introduces PoseCNN, a new convolutional neural network for 6D object pose estimation, which is highly robust to occlusions, can handle symmetric objects, and provides accurate pose estimation using only color images as input.

6-PACK: Category-level 6D Pose Tracker with Anchor-Based Keypoints

6-PACK learns to compactly represent an object by a handful of 3D keypoints, based on which the interframe motion of an object instance can be estimated through keypoint matching, and substantially outperforms existing methods on the NOCS category-level 6D pose estimation benchmark.

Domain Transfer for 3D Pose Estimation from Color Images without Manual Annotations

A novel learning method for 3D pose estimation from color images is presented that achieves performance comparable to state-of-the-art methods on popular benchmark datasets without requiring any annotations for the color images.