Extreme Relative Pose Network Under Hybrid Representations

Zhenpei Yang, Siming Yan, Qi-Xing Huang. In 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
In this paper, we introduce a novel RGB-D based relative pose estimation approach that is suitable for small-overlap or non-overlapping scans and can output multiple relative poses. Our method performs scene completion and matches the completed scans. However, instead of using a fixed representation for completion, the key idea is to utilize hybrid representations that combine 360° images, 2D image-based layouts, and planar patches. This approach offers adaptive feature representations for…


Relative Pose Estimation for RGB-D Human Input Scans via Implicit Function Reconstruction

A novel end-to-end, coarse-to-fine optimization method that combines implicit function reconstruction with differentiable rendering for RGB-D human input scans at arbitrary overlaps in relative pose estimation, and considerably outperforms standard pipelines in non-overlapping setups.

Extreme Rotation Estimation using Dense Correlation Volumes

This work presents a technique for estimating the relative 3D rotation of an RGB image pair in an extreme setting, where the images have little or no overlap, and proposes a network design that can automatically learn implicit cues, such as light source directions, vanishing points, and symmetries present in the scene.

Deep Confidence Guided Distance for 3D Partial Shape Registration

A novel non-iterative learnable method for partial-to-partial 3D shape registration that fuses learnable similarity between point embeddings with spatial distance between point clouds, inducing an optimized solution for the overlapping points while ignoring parts that appear in only one of the shapes.

Scene Synthesis via Uncertainty-Driven Attribute Synchronization

This paper introduces a novel neural scene synthesis approach that can capture diverse feature patterns of 3D scenes and uses the parametric prior distributions learned from training data to regularize the outputs of feed-forward neural models.

Global-Aware Registration of Less-Overlap RGB-D Scans

A reinforcement learning strategy is introduced to iteratively align RGB-D scans with the panorama and refine the panorama representation, which reduces the noise of global information and preserves global consistency of both geometric and photometric alignments.

Virtual Correspondence: Humans as a Cue for Extreme-View Geometry

A method to find virtual correspondences based on humans in the scene is introduced; it significantly outperforms state-of-the-art camera pose estimation methods in challenging scenarios and is comparable to them in the traditional densely captured setup.

Associative3D: Volumetric Reconstruction from Sparse Views

A new approach is proposed that estimates reconstructions, distributions over the camera/object and camera/camera transformations, and an inter-view object affinity matrix; these estimates are jointly reasoned over to produce the most likely explanation of the scene.

The 8-Point Algorithm as an Inductive Bias for Relative Pose Prediction by ViTs

It is shown that a handful of modifications can be applied to a Vision Transformer (ViT) to bring its computations close to the Eight-Point Algorithm.
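For reference, the classical method the title alludes to can be sketched directly. Below is a minimal NumPy implementation of the standard Hartley-normalized eight-point algorithm for estimating a fundamental matrix from point correspondences; it is an illustration of the classical baseline, not of the paper's ViT architecture.

```python
import numpy as np

def normalize(pts):
    # Hartley normalization: move centroid to origin, scale mean distance to sqrt(2)
    centroid = pts.mean(axis=0)
    d = np.sqrt(((pts - centroid) ** 2).sum(axis=1)).mean()
    s = np.sqrt(2) / d
    T = np.array([[s, 0, -s * centroid[0]],
                  [0, s, -s * centroid[1]],
                  [0, 0, 1.0]])
    ph = np.hstack([pts, np.ones((len(pts), 1))])
    return (T @ ph.T).T, T

def eight_point(x1, x2):
    # x1, x2: (N, 2) corresponding image points, N >= 8; returns F with x2^T F x1 = 0
    p1, T1 = normalize(x1)
    p2, T2 = normalize(x2)
    # Each correspondence contributes one row of the linear system A f = 0
    A = np.stack([
        p2[:, 0] * p1[:, 0], p2[:, 0] * p1[:, 1], p2[:, 0],
        p2[:, 1] * p1[:, 0], p2[:, 1] * p1[:, 1], p2[:, 1],
        p1[:, 0], p1[:, 1], np.ones(len(p1)),
    ], axis=1)
    _, _, Vt = np.linalg.svd(A)
    F = Vt[-1].reshape(3, 3)
    # Enforce the rank-2 constraint on F
    U, S, Vt = np.linalg.svd(F)
    S[2] = 0
    F = U @ np.diag(S) @ Vt
    # Undo the normalization
    return T2.T @ F @ T1
```

On exact (noise-free) correspondences the recovered F satisfies the epipolar constraint to numerical precision; the normalization step is what makes the plain linear solve well-conditioned.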

Extreme Structure from Motion for Indoor Panoramas without Visual Overlaps

An extreme Structure from Motion algorithm is proposed for residential indoor panoramas that have little to no visual overlap; experiments show that an existing SfM approach completely fails for most of the houses.

HM3D-ABO: A Photo-realistic Dataset for Object-centric Multi-view 3D Reconstruction

This report presents HM3D-ABO, a photo-realistic object-centric dataset constructed by composing realistic indoor scenes with realistic objects, providing multi-view RGB observations, a water-tight mesh model for each object, ground-truth depth maps, and object masks.



Extreme Relative Pose Estimation for RGB-D Scans via Scene Completion

This work introduces a novel approach that extends relative pose estimation to extreme cases, with little or even no overlap between the input scans, by inferring more complete scene information about the underlying environment and matching the completed scans.

Floorplan-Jigsaw: Jointly Estimating Scene Layout and Aligning Partial Scans

Without requiring feature matching, the proposed method enables novel applications such as reconstruction and modeling of large or featureless scenes from sparse input, and is evaluated quantitatively and qualitatively on real and synthetic scenes of various sizes and complexities.

Semantic Scene Completion from a Single Depth Image

The semantic scene completion network (SSCNet) is introduced, an end-to-end 3D convolutional network that takes a single depth image as input and simultaneously outputs occupancy and semantic labels for all voxels in the camera view frustum.

PlaneMatch: Patch Coplanarity Prediction for Robust RGB-D Reconstruction

Coplanarity constraints detected with the method are found to be sufficient to obtain reconstruction results comparable to state-of-the-art frameworks on most scenes, and to outperform other methods on standard benchmarks when combined with a simple keypoint method.

LayoutNet: Reconstructing the 3D Room Layout from a Single RGB Image

An algorithm is proposed to predict room layout from a single image that generalizes across panoramas and perspective images, as well as cuboid and more general layouts (e.g. "L"-shaped rooms); it achieves among the best accuracy for perspective images and can handle both cuboid-shaped and more general Manhattan layouts.

Matterport3D: Learning from RGB-D Data in Indoor Environments

Matterport3D is introduced, a large-scale RGB-D dataset containing 10,800 panoramic views from 194,400 RGB-D images of 90 building-scale scenes that enable a variety of supervised and self-supervised computer vision tasks, including keypoint matching, view overlap prediction, normal prediction from color, semantic segmentation, and region classification.

Relative Camera Pose Estimation Using Convolutional Neural Networks

This paper presents a convolutional neural network based approach for estimating the relative pose between two cameras. The proposed network takes RGB images from both cameras as input and directly…

Super 4PCS: Fast Global Pointcloud Registration via Smart Indexing

This work presents Super 4PCS for global pointcloud registration that is optimal, i.e., runs in linear time and is also output sensitive in the complexity of the alignment problem based on the (unknown) overlap across scan pairs, allowing unstructured, efficient acquisition of scenes at scales previously not possible.

Sparse Iterative Closest Point

This work proposes a new formulation of the Iterative Closest Point algorithm that retains the simple structure of the ICP algorithm, while achieving superior registration results when dealing with outliers and incomplete data.
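The robust-norm idea behind this line of work can be illustrated with a small point-to-point sketch: replacing the squared residual with an lp norm (0 < p <= 1) and solving via iteratively reweighted least squares, each step being a weighted Procrustes (Kabsch) alignment. This is my own minimal NumPy illustration of the general technique, not the authors' implementation.

```python
import numpy as np

def kabsch(P, Q, w):
    # Weighted rigid alignment: find R, t minimizing sum_i w_i ||R p_i + t - q_i||^2
    w = w / w.sum()
    mp = (w[:, None] * P).sum(axis=0)
    mq = (w[:, None] * Q).sum(axis=0)
    H = (P - mp).T @ (w[:, None] * (Q - mq))
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))  # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = mq - R @ mp
    return R, t

def sparse_icp(src, dst, p=0.5, iters=30, eps=1e-6):
    # ICP with an lp residual, solved by iteratively reweighted least squares:
    # weights w_i = r_i^(p-2) downweight large residuals (outliers, missing data)
    R, t = np.eye(3), np.zeros(3)
    for _ in range(iters):
        cur = src @ R.T + t
        # Brute-force nearest neighbours (fine for small clouds)
        d2 = ((cur[:, None, :] - dst[None, :, :]) ** 2).sum(-1)
        nn = dst[d2.argmin(axis=1)]
        r = np.linalg.norm(cur - nn, axis=1)
        w = (r + eps) ** (p - 2)
        R, t = kabsch(src, nn, w)
    return R, t
```

With p = 2 the weights are constant and this reduces to standard ICP; smaller p makes the correspondence fitting increasingly tolerant of points that have no true counterpart in the other scan.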