Neural RGB-D Surface Reconstruction

Dejan Azinović, Ricardo Martin-Brualla, Dan B. Goldman, Matthias Nießner, Justus Thies
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
Obtaining high-quality 3D reconstructions of room-scale scenes is of paramount importance for upcoming applications in AR or VR. These range from mixed reality applications for teleconferencing, virtual measuring, and virtual room planning to robotic applications. While current volume-based view synthesis methods that use neural radiance fields (NeRFs) show promising results in reproducing the appearance of an object or scene, they do not reconstruct an actual surface. The volumetric representation…

RGB-Only Reconstruction of Tabletop Scenes for Collision-Free Manipulator Control

A system for collision-free control of a robot manipulator that uses only RGB views of the world to reach a desired pose while avoiding obstacles in the ESDF is presented.

iDF-SLAM: End-to-End RGB-D SLAM with Neural Implicit Mapping and Deep Feature Tracking

A novel end-to-end RGB-D SLAM, iDF-SLAM, which adopts a feature-based deep neural tracker as the front-end and a NeRF-style neural implicit mapper as the back-end, enabling lifelong learning of the SLAM system.

GO-Surf: Neural Feature Grid Optimization for Fast, High-Fidelity RGB-D Surface Reconstruction

GO-Surf is a direct feature grid optimization method for accurate and fast surface reconstruction from RGB-D sequences. It can optimize sequences of 1-2K frames in 15-45 minutes, a speedup of 60× over NeuralRGB-D, the most closely related approach based on an MLP representation, while maintaining on-par performance on standard benchmarks.

What's the Situation with Intelligent Mesh Generation: A Survey and Perspectives

Focusing on 110 preliminary IMG methods, an in-depth analysis and evaluation from multiple perspectives is conducted, including the core technique and application scope of the algorithm, agent learning goals, data types, targeting challenges, advantages and limitations.

Vox-Fusion: Dense Tracking and Mapping with Voxel-based Neural Implicit Representation

This work presents a dense tracking and mapping system named Vox-Fusion, which seamlessly fuses neural implicit representations with traditional volumetric fusion methods and leverages a voxel-based neural implicit surface representation to encode and optimize the scene inside each voxel.

SHINE-Mapping: Large-Scale 3D Mapping Using Sparse Hierarchical Implicit Neural Representations

This paper addresses the problems of achieving large-scale 3D reconstructions with implicit representations using 3D LiDAR measurements, and designs an incremental mapping system with regularization to tackle the issue of catastrophic forgetting in continual learning.

BNV-Fusion: Dense 3D Reconstruction using Bi-level Neural Volume Fusion

This work proposes a novel bi-level fusion strategy that considers both efficiency and reconstruction quality by design, and evaluates the proposed method on multiple datasets quantitatively and qualitatively, demonstrating a significant improvement over existing methods.

BS3D: Building-scale 3D Reconstruction from RGB-D Images

An easy-to-use framework for acquiring building-scale 3D reconstructions with a consumer depth camera, which enables crowd-sourcing. It uses raw depth maps for odometry computation and loop closure refinement, which results in better reconstructions.

Mixed Reality Communication for Medical Procedures: Teaching the Placement of a Central Venous Catheter

The results indicate that the mixed reality real-time communication system enhances visual communication and offers new possibilities for it compared to video teleconference-based training, and that it could improve remote emergency assistance.

Incremental Learning for Neural Radiance Field with Uncertainty-Filtered Knowledge Distillation

A student-teacher pipeline is proposed to mitigate catastrophic forgetting when continuously learning from streaming data without revisiting previous training data; a random inquirer and an uncertainty-based filter help retain old knowledge from the teacher network.



NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis

This work describes how to effectively optimize neural radiance fields to render photorealistic novel views of scenes with complicated geometry and appearance, and demonstrates results that outperform prior work on neural rendering and view synthesis.

Marching cubes: A high resolution 3D surface construction algorithm

We present a new algorithm, called marching cubes, that creates triangle models of constant density surfaces from 3D medical data. Using a divide-and-conquer approach to generate inter-slice

ScanNet: Richly-Annotated 3D Reconstructions of Indoor Scenes

This work introduces ScanNet, an RGB-D video dataset containing 2.5M views in 1513 scenes annotated with 3D camera poses, surface reconstructions, and semantic segmentations, and shows that using this data helps achieve state-of-the-art performance on several 3D scene understanding tasks.

BundleFusion: real-time globally consistent 3D reconstruction using on-the-fly surface re-integration

This work presents a novel, real-time, end-to-end reconstruction framework that outperforms state-of-the-art online systems, with quality on par with offline methods but with unprecedented speed and scan completeness.

NeuS: Learning Neural Implicit Surfaces by Volume Rendering for Multi-view Reconstruction

Experiments show that NeuS outperforms the state of the art in high-quality surface reconstruction, especially for objects and scenes with complex structures and self-occlusion.

Convolutional Occupancy Networks

Convolutional Occupancy Networks are proposed as a more flexible implicit representation for detailed reconstruction of objects and 3D scenes. The method enables fine-grained implicit 3D reconstruction of single objects, scales to large indoor scenes, and generalizes well from synthetic to real data.

A volumetric method for building complex models from range images

This paper presents a volumetric method for integrating range images that is able to integrate a large number of range images yielding seamless, high-detail models of up to 2.6 million triangles.

Plenoxels: Radiance Fields without Neural Networks

This work introduces Plenoxels (plenoptic voxels), a system for photorealistic view synthesis that can be optimized from calibrated images via gradient methods and regularization without any neural components.

Direct Voxel Grid Optimization: Super-fast Convergence for Radiance Fields Reconstruction

A super-fast convergence approach to reconstructing the per-scene radiance field from a set of images that capture the scene with known poses. It matches, if not surpasses, NeRF's quality, yet takes only about 15 minutes to train from scratch for a new scene.

Panoptic 3D Scene Reconstruction From a Single RGB Image

This work proposes a new approach for holistic 3D scene understanding from a single RGB image which learns to lift and propagate 2D features from an input image to a 3D volumetric scene representation and demonstrates that this holistic view of joint scene reconstruction, semantic, and instance segmentation is beneficial over treating the tasks independently, thus outperforming alternative approaches.