NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis

Ben Mildenhall, Pratul P. Srinivasan, Matthew Tancik, Jonathan T. Barron, Ravi Ramamoorthi, and Ren Ng
We present a method that achieves state-of-the-art results for synthesizing novel views of complex scenes by optimizing an underlying continuous volumetric scene function using a sparse set of input views. Our algorithm represents a scene using a fully-connected (non-convolutional) deep network, whose input is a single continuous 5D coordinate (spatial location $(x,y,z)$ and viewing direction $(\theta, \phi)$) and whose output is the volume density and view-dependent emitted radiance at that… 
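As a rough numerical illustration of the volume-rendering quadrature that turns the network's per-sample density and radiance into a pixel color (a sketch, not the authors' code; the toy sample values are made up):

```python
import numpy as np

def composite(densities, colors, deltas):
    """Alpha-composite per-sample densities and colors along one ray.

    densities: (S,) non-negative sigma values from the network;
    colors: (S, 3) RGB radiance per sample; deltas: (S,) distances
    between adjacent samples. Implements the standard quadrature
    C = sum_i T_i * (1 - exp(-sigma_i * delta_i)) * c_i with
    T_i = exp(-sum_{j<i} sigma_j * delta_j).
    """
    alpha = 1.0 - np.exp(-densities * deltas)                      # per-sample opacity
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1]]))  # accumulated transmittance T_i
    weights = trans * alpha                                        # contribution of each sample
    return weights @ colors                                        # (3,) composited pixel color

# Toy ray: a semi-transparent green sample in front of a nearly opaque red one.
sigma = np.array([1.0, 50.0])
rgb = np.array([[0.0, 1.0, 0.0], [1.0, 0.0, 0.0]])
delta = np.array([0.1, 0.1])
print(composite(sigma, rgb, delta))
```

The weights sum to at most one; whatever transmittance remains corresponds to the ray exiting the volume without hitting anything.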

Voxurf: Voxel-based Efficient and Accurate Neural Surface Reconstruction

A voxel-based approach for neural surface reconstruction consisting of two stages: the first leverages a learnable feature grid to obtain a coherent coarse shape, and the second recovers detailed geometry with a dual color network that captures the precise color-geometry dependency.

Neural Rendering in a Room: Amodal 3D Understanding and Free-Viewpoint Rendering for the Closed Scene Composed of Pre-Captured Objects

The experiments demonstrate that the two-stage design achieves robust 3D scene understanding and outperforms competing methods by a large margin, and it is shown that the realistic free-viewpoint rendering enables various applications, including scene touring and editing.

360Roam: Real-Time Indoor Roaming Using Geometry-Aware $360^\circ$ Radiance Fields

A scene-level NeRF system that renders large-scale indoor scenes in real time and supports VR roaming: it builds an omnidirectional radiance field and applies an adaptive divide-and-conquer strategy to fine-tune the radiance fields, using a floorplan of the scene to facilitate a realistic roaming experience.

VolTeMorph: Realtime, Controllable and Generalisable Animation of Volumetric Representations

Fig. 1. We propose a method to deform static multi-view volumetric models, such as NeRF, in real time using blendshape- or physics-driven animation. This allows us to create dynamic scenes from static ones.

End-to-end View Synthesis via NeRF Attention

The NeRF attention (NeRFA) is proposed, which considers the volumetric rendering equation as a soft feature modulation procedure and adopts the ray and pixel transformers to learn the interactions between rays and pixels.

PeRFception: Perception using Radiance Fields

This work creates the first large-scale implicit representation datasets for perception tasks, called the PeRFception dataset, which consists of two parts that incorporate both object-centric and scene-centric scans for classification and segmentation.

NeurAR: Neural Uncertainty for Autonomous 3D Reconstruction

This paper explores for the first time the possibility of using implicit neural representations for autonomous 3D scene reconstruction by addressing two key challenges: seeking a criterion to measure the quality of the candidate viewpoints for the view planning based on the new representations, and learning the criterion from data that can generalize to different scenes instead of hand-crafting one.

Generalizable Patch-Based Neural Rendering

This work proposes a different paradigm, where no deep visual features and no NeRF-like volume rendering are needed, and outperforms the state-of-the-art on novel view synthesis of unseen scenes even when being trained with considerably less data than prior work.

Supplementary Material for NeRFReN: Neural Radiance Fields with Reflections

  • Physics, 2022
An illustration of how $\lambda_d$ and $\lambda_{bdc}$ change during training is shown in Fig. 2. Note that we first increase $\lambda_d$ to ensure correct geometry for the transmitted component and then increase $\lambda_{bdc}$.

Neural Prior for Trajectory Estimation

This work proposes a neural trajectory prior to capture continuous spatio-temporal information without the need for offline data and demonstrates how the proposed objective is optimized during runtime to estimate trajectories for two important tasks: Non-Rigid Structure from Motion (NRSfM) and lidar scene flow integration for self-driving scenes.

Local light field fusion

An algorithm for view synthesis from an irregular grid of sampled views that first expands each sampled view into a local light field via a multiplane image (MPI) scene representation, then renders novel views by blending adjacent local light fields.

Reconstructing continuous distributions of 3D protein structure from cryo-EM images

The proposed method, termed cryoDRGN, is the first neural network-based approach for cryo-EM reconstruction and the first end-to-end method for directly reconstructing continuous ensembles of protein structures from cryo-EM images.

Neural volumes

This work presents a learning-based approach to representing dynamic objects inspired by the integral projection model used in tomographic imaging, and learns a latent representation of a dynamic scene that enables us to produce novel content sequences not seen during training.

DeepVoxels: Learning Persistent 3D Feature Embeddings

This work proposes DeepVoxels, a learned representation that encodes the view-dependent appearance of a 3D scene without having to explicitly model its geometry, based on a Cartesian 3D grid of persistent embedded features that learn to make use of the underlying 3D scene structure.

Multilayer feedforward networks are universal approximators

Fourier Features Let Networks Learn High Frequency Functions in Low Dimensional Domains

An approach for selecting problem-specific Fourier features that greatly improves the performance of MLPs for low-dimensional regression tasks relevant to the computer vision and graphics communities is suggested.
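The Fourier feature mapping this entry refers to is simple enough to sketch: project low-dimensional coordinates through a random frequency matrix and take sines and cosines. The Gaussian sampling and the scale value below are illustrative choices, not the paper's tuned settings:

```python
import numpy as np

def fourier_features(v, B):
    """Map low-dimensional inputs to random Fourier features.

    v: (N, d) input coordinates, e.g. pixel positions in [0, 1);
    B: (m, d) frequency matrix, typically sampled i.i.d. Gaussian
    with a scale that controls the bandwidth of the mapping.
    Returns (N, 2m) features [cos(2*pi*Bv), sin(2*pi*Bv)], which an
    MLP then consumes instead of the raw coordinates.
    """
    proj = 2.0 * np.pi * v @ B.T  # (N, m) projected phases
    return np.concatenate([np.cos(proj), np.sin(proj)], axis=-1)

rng = np.random.default_rng(0)
B = rng.normal(scale=10.0, size=(64, 2))  # scale is a hyperparameter
v = rng.uniform(size=(5, 2))              # five 2D query points
feats = fourier_features(v, B)
print(feats.shape)  # (5, 128)
```

A larger frequency scale lets the downstream MLP fit higher-frequency detail but can introduce noise; too small a scale over-smooths the regression.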

Local Deep Implicit Functions for 3D Shape

Local Deep Implicit Functions (LDIF), a 3D shape representation that decomposes space into a structured set of learned implicit functions that provides higher surface reconstruction accuracy than the state-of-the-art (OccNet), while requiring fewer than 1% of the network parameters.

Unified Neural Encoding of BTFs

A unified network architecture, inspired by autoencoders, that is trained on a variety of materials and projects reflectance measurements to a shared latent parameter space; the latent space is shown to be well-behaved and can be sampled from.

Local Implicit Grid Representations for 3D Scenes

This paper introduces Local Implicit Grid Representations, a new 3D shape representation designed for scalability and generality and demonstrates the value of this proposed approach for 3D surface reconstruction from sparse point observations, showing significantly better results than alternative approaches.

Differentiable Volumetric Rendering: Learning Implicit 3D Representations Without 3D Supervision

This work proposes a differentiable rendering formulation for implicit shape and texture representations, showing that depth gradients can be derived analytically using the concept of implicit differentiation, and finds that this method can be used for multi-view 3D reconstruction, directly resulting in watertight meshes.