Volumetric Disentanglement for 3D Scene Manipulation

Sagie Benaim, Frederik Warburg, Peter Ebert Christensen, Serge J. Belongie
Recently, advances in differential volumetric rendering enabled significant breakthroughs in the photo-realistic and fine-detailed reconstruction of complex 3D scenes, which is key for many virtual reality applications. However, in the context of augmented reality, one may also wish to effect semantic manipulations or augmentations of objects within a scene. To this end, we propose a volumetric framework for (i) disentangling, or separating, the volumetric representation of a given foreground…

Decomposing NeRF for Editing via Feature Field Distillation

This work proposes to distill the knowledge of off-the-shelf, supervised and self-supervised 2D image feature extractors into a 3D feature field optimized in parallel to the radiance field, enabling query-based local editing of the represented 3D scenes.

K-Planes: Explicit Radiance Fields in Space, Time, and Appearance

This work uses a linear feature decoder with a learned color basis that performs comparably to a nonlinear black-box MLP decoder; its planar factorization induces a natural decomposition of static and dynamic components of a scene.

ClimateNeRF: Physically-based Neural Rendering for Extreme Climate Synthesis

Physical simulations produce excellent predictions of weather effects. Neural radiance fields produce SOTA scene models. We describe a novel NeRF-editing procedure that can fuse physical simulations



NeRV: Neural Reflectance and Visibility Fields for Relighting and View Synthesis

We present a method that takes as input a set of images of a scene illuminated by unconstrained known lighting, and produces as output a 3D representation that can be rendered from novel viewpoints

NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis

This work describes how to effectively optimize neural radiance fields to render photorealistic novel views of scenes with complicated geometry and appearance, and demonstrates results that outperform prior work on neural rendering and view synthesis.

PIFu: Pixel-Aligned Implicit Function for High-Resolution Clothed Human Digitization

The proposed Pixel-aligned Implicit Function (PIFu), an implicit representation that locally aligns pixels of 2D images with the global context of their corresponding 3D object, achieves state-of-the-art performance on a public benchmark and outperforms the prior work for clothed human digitization from a single image.

GIRAFFE: Representing Scenes as Compositional Generative Neural Feature Fields

The key hypothesis is that incorporating a compositional 3D scene representation into the generative model leads to more controllable image synthesis; a fast and realistic image synthesis model is proposed.

Stay Positive: Non-Negative Image Synthesis for Augmented Reality

A novel optimization procedure is proposed that produces images satisfying both semantic and non-negativity constraints; it can incorporate existing state-of-the-art methods and exhibits strong performance in a variety of tasks, including image-to-image translation and style transfer.

Editing Conditional Radiance Fields

This paper introduces a method for propagating coarse 2D user scribbles to the 3D space, to modify the color or shape of a local region, and proposes a conditional radiance field that incorporates new modular network components, including a shape branch that is shared across object instances.

Occupancy Flow: 4D Reconstruction by Learning Particle Dynamics

This work presents Occupancy Flow, a novel spatio-temporal representation of time-varying 3D geometry with implicit correspondences, which can be used for interpolation and reconstruction tasks and is put forward as a promising 4D representation for a variety of spatio-temporal reconstruction tasks.

Text and Image Guided 3D Avatar Generation and Manipulation

This work proposes a novel 3D manipulation method that can manipulate both the shape and texture of the model using text or image-based prompts such as 'a young face' or 'a surprised face', and uses the Contrastive Language-Image Pre-training (CLIP) model together with a pre-trained 3D GAN model to create a fully differentiable rendering pipeline for manipulating meshes.

DeepSDF: Learning Continuous Signed Distance Functions for Shape Representation

This work introduces DeepSDF, a learned continuous Signed Distance Function (SDF) representation of a class of shapes that enables high quality shape representation, interpolation and completion from partial and noisy 3D input data.

NeRD: Neural Reflectance Decomposition from Image Collections

This work proposes a neural reflectance decomposition (NeRD) technique that uses physically-based rendering to decompose the scene into spatially varying BRDF material properties, enabling fast real-time rendering under novel illumination.