Neural Radiance Flow for 4D View Synthesis and Video Processing

@article{Du2021NeuralRF,
  title={Neural Radiance Flow for 4D View Synthesis and Video Processing},
  author={Yilun Du and Yinan Zhang and Hong-Xing Yu and Joshua B. Tenenbaum and Jiajun Wu},
  journal={2021 IEEE/CVF International Conference on Computer Vision (ICCV)},
  year={2021},
  pages={14304-14314}
}
  • Yilun Du, Yinan Zhang, Hong-Xing Yu, Joshua B. Tenenbaum, Jiajun Wu
  • Published 17 December 2020
  • Computer Science
  • 2021 IEEE/CVF International Conference on Computer Vision (ICCV)
We present a method, Neural Radiance Flow (NeRFlow), to learn a 4D spatial-temporal representation of a dynamic scene from a set of RGB images. Key to our approach is the use of a neural implicit representation that learns to capture the 3D occupancy, radiance, and dynamics of the scene. By enforcing consistency across different modalities, our representation enables multi-view rendering in diverse dynamic scenes, including water pouring, robotic interaction, and real images, outperforming… 
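The abstract describes the core idea: a single implicit function over space-time that jointly predicts occupancy/density, view-dependent radiance, and scene dynamics (flow), with consistency losses tying these modalities together. Below is a minimal, illustrative PyTorch sketch of such a 4D field. It is not the authors' code; the class name RadianceFlowField, the layer sizes, and the omission of positional encoding and the actual consistency losses are assumptions made purely for illustration.

# Illustrative sketch (not the authors' implementation): a 4D radiance-field MLP
# in the spirit of NeRFlow. It maps a space-time point (x, y, z, t) plus a viewing
# direction to density, RGB radiance, and a 3D scene-flow vector.
import torch
import torch.nn as nn

HIDDEN = 256  # assumed hidden width, for illustration only

class RadianceFlowField(nn.Module):
    def __init__(self):
        super().__init__()
        # Shared trunk over the 4D space-time coordinate (x, y, z, t).
        self.trunk = nn.Sequential(
            nn.Linear(4, HIDDEN), nn.ReLU(),
            nn.Linear(HIDDEN, HIDDEN), nn.ReLU(),
        )
        # Density (occupancy) depends only on the space-time point.
        self.density_head = nn.Linear(HIDDEN, 1)
        # Radiance additionally conditions on the viewing direction.
        self.rgb_head = nn.Sequential(
            nn.Linear(HIDDEN + 3, HIDDEN // 2), nn.ReLU(),
            nn.Linear(HIDDEN // 2, 3), nn.Sigmoid(),
        )
        # Scene flow: a 3D velocity per point, usable for consistency losses.
        self.flow_head = nn.Linear(HIDDEN, 3)

    def forward(self, xyzt, view_dir):
        feat = self.trunk(xyzt)
        sigma = torch.relu(self.density_head(feat))               # (N, 1) density
        rgb = self.rgb_head(torch.cat([feat, view_dir], dim=-1))  # (N, 3) color
        flow = self.flow_head(feat)                               # (N, 3) velocity
        return sigma, rgb, flow

# Usage: query a batch of space-time samples along camera rays.
model = RadianceFlowField()
xyzt = torch.rand(1024, 4)        # sampled (x, y, z, t) points
view_dir = torch.randn(1024, 3)   # per-sample viewing directions
sigma, rgb, flow = model(xyzt, view_dir)

In a full pipeline, sigma and rgb would feed a volume-rendering integral per ray, while flow would drive the cross-time consistency constraints the abstract refers to.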
Light Field Neural Rendering
TLDR
A two-stage transformer-based model that aggregates features along epipolar lines, then aggregates features along reference views to produce the color of a target ray, in order to represent view-dependent effects accurately.
Revealing Occlusions with 4D Neural Fields
TLDR
A framework for learning to estimate 4D visual representations from monocular RGB-D video, which is able to persist objects even after they become occluded, is introduced.
GANcraft: Unsupervised 3D Neural Rendering of Minecraft Worlds
TLDR
GANcraft is presented, an unsupervised neural rendering framework for generating photorealistic images of large 3D block worlds such as those created in Minecraft, and allows user control over both scene semantics and output style.
Animatable Neural Radiance Fields from Monocular RGB-D
TLDR
This paper introduces a novel method to integrate observations across frames and encode the appearance at each individual frame by utilizing the human pose, which models the body shape, and point clouds that partially cover the human as inputs.
Mixture of volumetric primitives for efficient neural rendering
TLDR
Mixture of Volumetric Primitives (MVP), a representation for rendering dynamic 3D content that combines the completeness of volumetric representations with the efficiency of primitive-based rendering, is presented.
Panoptic Neural Fields: A Semantic Object-Aware Neural Scene Representation
TLDR
Panoptic Neural Fields is presented, an object-aware neural scene representation that decomposes a scene into a set of objects (things) and background (stuff), and that can be smaller and faster than previous object-aware approaches while still leveraging category-specific priors incorporated via meta-learned initialization.
Solving Inverse Problems with NerfGANs
TLDR
A novel radiance field regularization method to obtain better 3D surfaces and improved novel views given single-view observations, which naturally extends to general inverse problems, including inpainting, where one observes only part of a single view.
Surface-Aligned Neural Radiance Fields for Controllable 3D Human Synthesis
TLDR
A new method for reconstructing controllable implicit 3D human models from sparse multi-view RGB videos using barycentric interpolation with modified vertex normals, which achieves higher quality in novel-view and novel-pose synthesis than existing methods.
3D Neural Scene Representations for Visuomotor Control
TLDR
This work shows that a dynamics model, constructed over the learned representation space, enables visuomotor control for challenging manipulation tasks involving both rigid bodies and fluids, where the target is specified in a viewpoint different from what the robot operates on.

References

Showing 1-10 of 88 references
D-NeRF: Neural Radiance Fields for Dynamic Scenes
TLDR
D-NeRF is introduced, a method that extends neural radiance fields to the dynamic domain, making it possible to reconstruct and render novel images of objects under rigid and non-rigid motion from a single camera moving around the scene.
View Synthesis by Appearance Flow
TLDR
This work addresses the problem of novel view synthesis: given an input image, synthesizing new images of the same object or scene observed from arbitrary viewpoints, and shows that for both objects and scenes, this approach is able to synthesize novel views of higher perceptual quality than previous CNN-based techniques.
High-quality video view interpolation using a layered representation
TLDR
This paper shows how high-quality video-based rendering of dynamic scenes can be accomplished using multiple synchronized video streams combined with novel image-based modeling and rendering algorithms, and develops a novel temporal two-layer compressed representation that handles matting.
DeepView: View Synthesis With Learned Gradient Descent
TLDR
This work presents a novel approach to view synthesis using multiplane images (MPIs) that incorporates occlusion reasoning, improving performance on challenging scene features such as object boundaries, lighting reflections, thin structures, and scenes with high depth complexity.
Occupancy Flow: 4D Reconstruction by Learning Particle Dynamics
TLDR
This work presents Occupancy Flow, a novel spatio-temporal representation of time-varying 3D geometry with implicit correspondences, which can be used for interpolation and reconstruction tasks; the authors argue that Occupancy Flow is a promising new 4D representation that will be useful for a variety of spatio-temporal reconstruction tasks.
Learning-based view synthesis for light field cameras
TLDR
This paper proposes a novel learning-based approach to synthesize new views from a sparse set of input views that could potentially decrease the required angular resolution of consumer light field cameras, which allows their spatial resolution to increase.
Virtual video camera: image-based viewpoint navigation through space and time
TLDR
This work presents an image-based rendering system to viewpoint-navigate through space and time of complex real-world, dynamic scenes, treating view interpolation uniformly in space and time, and shows how spatial viewpoint navigation, slow motion, and freeze-and-rotate effects can all be created in the same fashion.
X-Fields: Implicit Neural View-, Light- and Time-Image Interpolation
TLDR
The key idea to make this workable is an NN that already knows the "basic tricks" of graphics in a hard-coded and differentiable form, leading to a compact set of trainable parameters and hence real-time navigation in view, time, and illumination.
Deformable Neural Radiance Fields
TLDR
The proposed method can turn casually captured selfie photos/videos into deformable NeRF models that allow for photorealistic renderings of the subject from arbitrary viewpoints, which are dubbed "nerfies".
Deep blending for free-viewpoint image-based rendering
TLDR
This work presents a new deep learning approach to blending for IBR, in which held-out real image data is used to learn blending weights to combine input photo contributions, and designs the network architecture and the training loss to provide high quality novel view synthesis, while reducing temporal flickering artifacts.