Skeleton Merger: an Unsupervised Aligned Keypoint Detector

  • Ruoxi Shi, Zhengrong Xue, Yang You, Cewu Lu
  • Published 19 March 2021
  • Computer Science
  • 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
Detecting aligned 3D keypoints is essential in many scenarios such as object tracking, shape retrieval and robotics. However, it is generally hard to prepare a high-quality dataset for all types of objects due to the ambiguity of keypoints themselves. Meanwhile, current unsupervised detectors are unable to generate aligned keypoints with good coverage. In this paper, we propose an unsupervised aligned keypoint detector, Skeleton Merger, which utilizes skeletons to reconstruct objects. It is based…

Unsupervised Learning of 3D Semantic Keypoints with Mutual Reconstruction

The proposed method is the first to mine semantically consistent 3D keypoints from a mutual reconstruction view; it predicts keypoints that reconstruct not only the object itself but also other instances in the same category.

LAKe-Net: Topology-Aware Point Cloud Completion by Localizing Aligned Keypoints

LAKe-Net is a novel topology-aware point cloud completion model that localizes aligned keypoints, with a novel Keypoints-Skeleton-Shape prediction manner; experimental results show that the method achieves state-of-the-art performance on point cloud completion.

Object Wake-up: 3D Object Rigging from a Single Image

This work proposes an automated approach to build such 3D generic objects from single images and embed articulated skeletons in them and develops a novel skeleton prediction method with a multi-head structure for skeleton probability field estimation by utilizing the deep implicit functions.

Object Wake-up: 3D Object Reconstruction and Animation from a Single Image

An automated approach to tackle the entire process of reconstructing such 3D generic objects from single images, rigging, and animation, which goes beyond 2D manipulation and leads to greater flexibility in terms of feasible object motions.

A Convolutional Neural-Network-Based Training Model to Estimate Actual Distance of Persons in Continuous Images

This study proposes a training model based on a convolutional neural network, which uses a single-lens camera to estimate humans’ distance in continuous images and can partially restore depth information loss using built-in camera parameters that do not require additional correction.

Localization with Sampling-Argmax

Soft-argmax operation is commonly adopted in detection-based methods to localize the target position in a differentiable manner. However, training the neural network with soft-argmax makes the shape

Skeleton-free Pose Transfer for Stylized 3D Characters

This work presents the first method that automatically transfers poses between stylized 3D characters without skeletal rigging, and proposes a novel pose transfer network that predicts the character skinning weights and deformation transformations jointly to articulate the target character to match the desired pose.

USEEK: Unsupervised SE(3)-Equivariant 3D Keypoints for Generalizable Manipulation

This paper uses USEEK, an unsupervised SE(3)-equivariant keypoints method that enjoys alignment across instances in a category, to perform generalizable manipulation, and demonstrates that the keypoints produced by USEEK possess rich semantics, thus successfully transferring the functional knowledge from the demonstration object to the novel ones.



USIP: Unsupervised Stable Interest Point Detection From 3D Point Clouds

  • Jiaxin Li, Gim Hee Lee
  • Computer Science, Environmental Science
    2019 IEEE/CVF International Conference on Computer Vision (ICCV)
  • 2019
The USIP detector is an Unsupervised Stable Interest Point detector that can detect highly repeatable and accurately localized keypoints from 3D point clouds under arbitrary transformations without the need for any ground truth training data.

3D Point Cloud Registration for Localization Using a Deep Neural Network Auto-Encoder

We present an algorithm for registration between a large-scale point cloud and a close-proximity scanned point cloud, providing a localization solution that is fully independent of prior information

Unsupervised Learning of Category-Specific Symmetric 3D Keypoints from Point Sets

This paper aims at learning category-specific 3D keypoints in an unsupervised manner from a collection of misaligned 3D point clouds of objects from an unknown category, using symmetric linear basis shapes without assuming the plane of symmetry to be known.

Object Skeleton Extraction in Natural Images by Fusing Scale-Associated Deep Side Outputs

A fully convolutional network with multiple scale-associated side outputs is presented to address object skeleton extraction in natural images, and achieves promising results on two skeleton extraction datasets, and significantly outperforms other competitors.

Multi-view self-supervised deep learning for 6D pose estimation in the Amazon Picking Challenge

This paper proposes a self-supervised method to generate a large labeled dataset without tedious manual segmentation and demonstrates that the system can reliably estimate the 6D pose of objects under a variety of scenarios.

Learning 3D Keypoint Descriptors for Non-rigid Shape Matching

A novel deep learning framework that derives discriminative local descriptors for 3D surface shapes by leveraging a triplet network to perform deep metric learning, which takes a set of triplets as input and is minimized to distinguish between similar and dissimilar pairs of keypoints.

PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation

This paper designs a novel type of neural network that directly consumes point clouds, which well respects the permutation invariance of points in the input and provides a unified architecture for applications ranging from object classification, part segmentation, to scene semantic parsing.

KeypointNet: A Large-Scale 3D Keypoint Dataset Aggregated From Numerous Human Annotations

This work presents KeypointNet: the first large-scale and diverse 3D keypoint dataset that contains 83,231 keypoints and 8,329 3D models from 16 object categories, by leveraging numerous human annotations.

RotationNet: Joint Object Categorization and Pose Estimation Using Multiviews from Unsupervised Viewpoints

A Convolutional Neural Network (CNN)-based model "RotationNet," which takes multi-view images of an object as input and jointly estimates its pose and object category, and achieves the state-of-the-art performance on an object pose estimation dataset.

Reconstruction and Analysis of 3D Scenes

  • M. Weinmann
  • Environmental Science, Computer Science
    Springer International Publishing
  • 2016
A novel and fully automated framework involving a variety of components that allow an efficient reconstruction and analysis of large 3D environments up to city scale and offer a great potential for future research is introduced.