IBRNet: Learning Multi-View Image-Based Rendering

@inproceedings{Wang2021IBRNetLM,
  title={IBRNet: Learning Multi-View Image-Based Rendering},
  author={Qianqian Wang and Zhicheng Wang and Kyle Genova and Pratul P. Srinivasan and Howard Zhou and Jonathan T. Barron and Ricardo Martin-Brualla and Noah Snavely and Thomas A. Funkhouser},
  booktitle={2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2021},
  pages={4688-4697}
}
We present a method that synthesizes novel views of complex scenes by interpolating a sparse set of nearby views. The core of our method is a network architecture that includes a multilayer perceptron and a ray transformer that estimates radiance and volume density at continuous 5D locations (3D spatial locations and 2D viewing directions), drawing appearance information on the fly from multiple source views. By drawing on source views at render time, our method hearkens back to classic work on… 
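As a concrete illustration of this kind of query, here is a minimal PyTorch sketch written for this summary rather than taken from the paper: features gathered from the source views at each ray sample are pooled by a small MLP, a transformer layer attends across the samples along each ray to produce volume density, radiance is obtained by blending the source-view colors, and the samples are composited with standard volume rendering. The module names, dimensions, and the simple mean pooling are assumptions for illustration only.

import torch
import torch.nn as nn

class RayQuerySketch(nn.Module):
    # Hypothetical, simplified stand-in for an IBRNet-style query network.
    def __init__(self, feat_dim=32, hidden=64, n_heads=4):
        super().__init__()
        self.per_view = nn.Sequential(nn.Linear(feat_dim, hidden), nn.ReLU(),
                                      nn.Linear(hidden, hidden))
        # "Ray transformer": self-attention across the samples of one ray.
        self.ray_attn = nn.TransformerEncoderLayer(d_model=hidden, nhead=n_heads,
                                                   dim_feedforward=hidden,
                                                   batch_first=True)
        self.density_head = nn.Linear(hidden, 1)
        self.blend_head = nn.Linear(hidden, 1)   # per-source-view blending logits

    def forward(self, src_feats, src_colors):
        # src_feats:  (rays, samples, views, feat_dim) features from source images
        # src_colors: (rays, samples, views, 3) colors sampled from source images
        h = self.per_view(src_feats)                                   # (R, S, V, H)
        pooled = h.mean(dim=2)                                         # aggregate over views
        along_ray = self.ray_attn(pooled)                              # attend across samples
        sigma = torch.relu(self.density_head(along_ray)).squeeze(-1)   # (R, S)
        w = torch.softmax(self.blend_head(h).squeeze(-1), dim=-1)      # (R, S, V)
        rgb = (w.unsqueeze(-1) * src_colors).sum(dim=2)                # (R, S, 3)
        return rgb, sigma

def composite(rgb, sigma, deltas):
    # Classic volume rendering: alpha-composite the samples of each ray.
    alpha = 1.0 - torch.exp(-sigma * deltas)                           # (R, S)
    trans = torch.cumprod(torch.cat([torch.ones_like(alpha[:, :1]),
                                     1.0 - alpha[:, :-1]], dim=1), dim=1)
    weights = alpha * trans
    return (weights.unsqueeze(-1) * rgb).sum(dim=1)                    # (R, 3)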


Stereo Radiance Fields (SRF): Learning View Synthesis for Sparse Views of Novel Scenes
TLDR
Stereo Radiance Fields (SRF) is introduced, a neural view synthesis approach that is trained end-to-end, generalizes to new scenes, and requires only sparse views at test time; experiments show that SRF learns structure instead of overfitting on a scene, achieving significantly sharper, more detailed results than scene-specific models.
Baking Neural Radiance Fields for Real-Time View Synthesis
TLDR
A method to train a NeRF, then precompute and store it as a novel representation called a Sparse Neural Radiance Grid (SNeRG) that enables real-time rendering on commodity hardware and retains NeRF’s ability to render fine geometric details and view-dependent appearance.
MVSNeRF: Fast Generalizable Radiance Field Reconstruction from Multi-View Stereo
TLDR
This work proposes a generic deep neural network that can reconstruct radiance fields from only three nearby input views via fast network inference, leveraging plane-swept cost volumes for geometry-aware scene reasoning and combining them with physically based volume rendering for neural radiance field reconstruction.
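To make the cost-volume idea concrete, a common generic construction (a simplification for illustration, not MVSNeRF's exact pipeline) measures per-voxel feature variance across source views once their features have been warped onto fronto-parallel depth planes of the reference view; the tensor shapes below are assumptions.

import torch

def variance_cost_volume(warped_feats):
    # warped_feats: (views, depth_planes, channels, height, width), i.e. source-view
    # features already warped onto each sweep plane of the reference camera.
    mean = warped_feats.mean(dim=0)
    return ((warped_feats - mean) ** 2).mean(dim=0)   # (D, C, H, W) matching cost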
Putting NeRF on a Diet: Semantically Consistent Few-Shot View Synthesis
TLDR
DietNeRF improves the perceptual quality of few-shot view synthesis when learned from scratch, can render novel views with as few as one observed image when pre-trained on a multi-view dataset, and produces plausible completions of completely unobserved regions.
PlenOctrees for Real-time Rendering of Neural Radiance Fields
TLDR
It is shown that it is possible to train NeRFs to predict a spherical harmonic representation of radiance, removing the viewing direction as an input to the neural network, and PlenOctrees can be directly optimized to further minimize the reconstruction loss, which leads to equal or better quality compared to competing methods.
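For context, once radiance is stored as spherical harmonic (SH) coefficients, a view-dependent color is recovered by evaluating the real SH basis at the viewing direction and taking a dot product; the NumPy sketch below uses the standard degree-2 basis with hypothetical per-channel coefficients and is only an illustration of the representation, not the paper's code.

import numpy as np

def sh_color(coeffs, d):
    # coeffs: (3, 9) SH coefficients for R, G, B; d: unit-length viewing direction.
    x, y, z = d
    basis = np.array([
        0.282095,                                          # l = 0
        0.488603 * y, 0.488603 * z, 0.488603 * x,          # l = 1
        1.092548 * x * y, 1.092548 * y * z,                # l = 2
        0.315392 * (3.0 * z * z - 1.0),
        1.092548 * x * z, 0.546274 * (x * x - y * y),
    ])
    return coeffs @ basis                                  # view-dependent RGB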
Unsupervised Learning of 3D Object Categories from Videos in the Wild
TLDR
A new neural network design is proposed, called warp-conditioned ray embedding (WCR), which significantly improves reconstruction while obtaining a detailed implicit representation of the object surface and texture, also compensating for the noise in the initial SfM reconstruction that bootstrapped the learning process.
GNeRF: GAN-based Neural Radiance Field without Posed Camera
TLDR
GNeRF, a framework that marries Generative Adversarial Networks (GANs) with Neural Radiance Field reconstruction for complex scenarios with unknown and even randomly initialized camera poses, is introduced; it outperforms the baselines in scenes with repeated patterns or low textures that were previously regarded as extremely challenging.
KiloNeRF: Speeding up Neural Radiance Fields with Thousands of Tiny MLPs
TLDR
It is demonstrated that real-time rendering is possible by utilizing thousands of tiny MLPs instead of one single large MLP, and that, with teacher-student distillation for training, this speed-up can be achieved without sacrificing visual quality.
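The routing itself is simple; the sketch below is a hypothetical toy version of the idea (my own names and sizes, not the paper's code): the scene's bounding box is split into a regular grid, each cell owns a tiny MLP, and a query point is dispatched to the MLP of the cell that contains it.

import torch
import torch.nn as nn

class TinyMLPGrid(nn.Module):
    # Toy illustration: one tiny MLP per grid cell of the scene's bounding box.
    def __init__(self, res=4, hidden=32, out_dim=4, lo=-1.0, hi=1.0):
        super().__init__()
        self.res, self.lo, self.hi, self.out_dim = res, lo, hi, out_dim
        self.mlps = nn.ModuleList(
            nn.Sequential(nn.Linear(3, hidden), nn.ReLU(), nn.Linear(hidden, out_dim))
            for _ in range(res ** 3))

    def forward(self, pts):                        # pts: (N, 3) inside [lo, hi]^3
        cell = ((pts - self.lo) / (self.hi - self.lo) * self.res).long()
        cell = cell.clamp(0, self.res - 1)
        idx = (cell[:, 0] * self.res + cell[:, 1]) * self.res + cell[:, 2]
        out = torch.empty(pts.shape[0], self.out_dim, dtype=pts.dtype)
        for i in idx.unique():                     # query each occupied cell's MLP
            mask = idx == i
            out[mask] = self.mlps[int(i)](pts[mask])
        return out                                 # e.g. RGB + density per point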
EfficientNeRF: Efficient Neural Radiance Fields
TLDR
EfficientNeRF designs a novel data structure to cache the whole scene during testing to accelerate rendering, and experiments show that the method promotes the practicality of NeRF in the real world and enables many applications.
Stereo Magnification with Multi-Layer Images
TLDR
This work introduces a new view synthesis approach based on multiple semitransparent layers with scene-adapted geometry that outperforms the recently proposed IBRNet system based on implicit geometry representation.
...

References

SHOWING 1-10 OF 79 REFERENCES
NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis
TLDR
This work describes how to effectively optimize neural radiance fields to render photorealistic novel views of scenes with complicated geometry and appearance, and demonstrates results that outperform prior work on neural rendering and view synthesis.
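For reference, the quantity NeRF optimizes is rendered with the standard volume rendering quadrature; in the usual notation, a ray with samples i = 1..N, densities \sigma_i, colors \mathbf{c}_i, and sample spacings \delta_i has expected color

\hat{C}(\mathbf{r}) = \sum_{i=1}^{N} T_i \left(1 - e^{-\sigma_i \delta_i}\right) \mathbf{c}_i,
\qquad
T_i = \exp\!\left(-\sum_{j=1}^{i-1} \sigma_j \delta_j\right).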
Learning to Predict 3D Objects with an Interpolation-based Differentiable Renderer
TLDR
A differentiable rendering framework is proposed which allows gradients to be analytically computed for all pixels in an image by viewing foreground rasterization as a weighted interpolation of local properties and background rasterization as a distance-based aggregation of global geometry.
Soft Rasterizer: A Differentiable Renderer for Image-Based 3D Reasoning
TLDR
This work proposes a truly differentiable rendering framework that is able to directly render colorized mesh using differentiable functions and back-propagate efficient supervision signals to mesh vertices and their attributes from various forms of image representations, including silhouette, shading and color images.
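The key trick can be shown in a few lines; the sketch below is my own simplification of the soft-rasterization idea (hypothetical names, silhouette channel only): each face contributes a smooth occupancy probability driven by the pixel's signed distance to the projected face, and the contributions are fused with a probabilistic union so gradients reach every face.

import torch

def soft_silhouette(signed_dists, sigma=1e-4):
    # signed_dists: (faces, H, W) signed 2D distance from each pixel to each
    # projected face (positive inside, negative outside).
    prob = torch.sigmoid(torch.sign(signed_dists) * signed_dists ** 2 / sigma)
    return 1.0 - torch.prod(1.0 - prob, dim=0)     # probabilistic union over faces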
Deferred Neural Rendering: Image Synthesis using Neural Textures
TLDR
This work proposes Neural Textures, learned feature maps trained as part of the scene capture process, which can be utilized to coherently re-render or manipulate existing video content in both static and dynamic environments at real-time rates.
Deep blending for free-viewpoint image-based rendering
TLDR
This work presents a new deep learning approach to blending for IBR, in which held-out real image data is used to learn blending weights to combine input photo contributions, and designs the network architecture and the training loss to provide high quality novel view synthesis, while reducing temporal flickering artifacts.
Stereo Magnification: Learning View Synthesis using Multiplane Images
TLDR
This paper explores an intriguing scenario for view synthesis: extrapolating views from imagery captured by narrow-baseline stereo cameras, including VR cameras and now-widespread dual-lens camera phones, and proposes a learning framework that leverages a new layered representation that is called multiplane images (MPIs).
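Rendering an MPI at (or near) the reference viewpoint reduces to back-to-front alpha compositing of its RGBA planes; the NumPy sketch below shows that step in isolation (a generic simplification that omits the per-plane homography warping used to re-render from a new viewpoint).

import numpy as np

def composite_mpi(rgba_layers):
    # rgba_layers: (planes, H, W, 4), ordered from nearest (index 0) to farthest.
    out = np.zeros(rgba_layers.shape[1:3] + (3,), dtype=np.float32)
    for layer in rgba_layers[::-1]:                # iterate far to near
        rgb, alpha = layer[..., :3], layer[..., 3:4]
        out = alpha * rgb + (1.0 - alpha) * out    # standard "over" compositing
    return out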
Free View Synthesis
TLDR
This work presents a method for novel view synthesis from input images that are freely distributed around a scene; it can synthesize images for free camera movement through the scene and works for general scenes with unconstrained geometric layouts.
Local Light Field Fusion: Practical View Synthesis with Prescriptive Sampling Guidelines
TLDR
An algorithm for view synthesis from an irregular grid of sampled views that first expands each sampled view into a local light field via a multiplane image (MPI) scene representation, then renders novel views by blending adjacent local light fields.
Neural Sparse Voxel Fields
TLDR
This work introduces Neural Sparse Voxel Fields (NSVF), a new neural scene representation for fast and high-quality free-viewpoint rendering that is over 10 times faster than the state-of-the-art (namely, NeRF) at inference time while achieving higher quality results.
GRAF: Generative Radiance Fields for 3D-Aware Image Synthesis
TLDR
This paper proposes a generative model for radiance fields which have recently proven successful for novel view synthesis of a single scene, and introduces a multi-scale patch-based discriminator to demonstrate synthesis of high-resolution images while training the model from unposed 2D images alone.
...