LOLNeRF: Learn from One Look

Daniel Rebain, Mark J. Matthews, Kwang Moo Yi, Dmitry Lagun, Andrea Tagliasacchi. 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
We present a method for learning a generative 3D model based on neural radiance fields, trained solely from data with only single views of each object. While generating realistic images is no longer a difficult task, producing the corresponding 3D structure so that it can be rendered from different viewpoints is non-trivial. We show that, unlike existing methods, one does not need multi-view data to achieve this goal. Specifically, we show that by reconstructing many images aligned to an…

VQ3D: Learning a 3D-Aware Generative Model on ImageNet

This work presents VQ3D, which introduces a NeRF-based decoder into a two-stage vector-quantized autoencoder, enabling generation and reconstruction of 3D-aware images from the 1000-class ImageNet dataset of 1.2 million training images.

Single-Stage Diffusion NeRF: A Unified Approach to 3D Generation and Reconstruction

SSDNeRF is presented, a unified approach that employs an expressive diffusion model to learn a generalizable prior of neural radiance fields (NeRF) from multi-view images of diverse objects to enable simultaneous 3D reconstruction and prior learning, even from sparsely available views.

Shape, Pose, and Appearance from a Single Image via Bootstrapped Radiance Field Inversion

This work introduces a principled end-to-end reconstruction framework for natural images, where accurate ground-truth poses are not available, and leverages an unconditional 3D-aware generator: the model produces a first guess of the solution, which is then refined via optimization.

TEGLO: High Fidelity Canonical Texture Mapping from Single-View Images

This work proposes TEGLO (Textured EG3D-GLO) for learning 3D representations from single view in-the-wild image collections for a given class of objects by training a conditional Neural Radiance Field without any explicit 3D supervision and demonstrates that such mapping enables texture transfer and texture editing without requiring meshes with shared topology.

Real-Time Radiance Fields for Single-Image Portrait View Synthesis

This work presents a one-shot method to infer and render a photorealistic 3D representation from a single unposed image (e.g., a face portrait) in real time, and benchmarks it against state-of-the-art methods.

RelPose: Predicting Probabilistic Relative Rotation for Single Objects in the Wild

A data-driven method for inferring the camera viewpoints given multiple images of an arbitrary object, using an energy-based formulation for representing distributions over relative camera rotations, which outperforms state-of-the-art SfM and SLAM methods given sparse images on both seen and unseen categories.

Deep Generative Models on 3D Representations: A Survey

A thorough review of the development of 3D generation, including 3D shape generation and 3D-aware image synthesis, is given from the perspective of both algorithms and, more importantly, representations.

NeRF, meet differential geometry!

This work shows how a direct mathematical formalism of previously proposed NeRF variants, aimed at improving performance in challenging conditions, can be used to natively encourage surface regularity (by means of Gaussian and mean curvatures), making it possible, for example, to learn surfaces from a very limited number of views.

Seeing 3D Objects in a Single Image via Self-Supervised Static-Dynamic Disentanglement

This work proposes an approach that observes unlabeled multiview videos at training time and learns to map a single image observation of a complex scene to a 3D neural scene representation that is disentangled into movable and immovable parts while plausibly completing its 3D structure.

Neural Groundplans: Persistent Neural Scene Representations from a Single Image

The ability to separately reconstruct movable objects enables a variety of downstream tasks using simple heuristics, such as extraction of object-centric 3D representations, novel view synthesis, instance-level segmentation, 3D bounding box prediction, and scene editing.

pixelNeRF: Neural Radiance Fields from One or Few Images

We propose pixelNeRF, a learning framework that predicts a continuous neural scene representation conditioned on one or few input images. The existing approach for constructing neural radiance fields…
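The conditioning mechanism pixelNeRF describes — attaching to each 3D query point a feature sampled at its projection into the input image — can be sketched in a few lines of NumPy. The function name and the nearest-neighbour lookup (instead of the paper's bilinear interpolation) are illustrative assumptions, not the actual implementation:

```python
import numpy as np

def pixel_aligned_feature(feature_map, K, point_cam):
    """Sample an image feature at the projection of a 3D point.

    feature_map: (H, W, C) CNN features of the input view
    K:           (3, 3) camera intrinsics
    point_cam:   (3,) query point in the input camera's frame
    """
    uvw = K @ point_cam
    u, v = uvw[0] / uvw[2], uvw[1] / uvw[2]   # perspective divide
    h, w, _ = feature_map.shape
    iu = int(np.clip(np.rint(u), 0, w - 1))   # clamp to image bounds
    iv = int(np.clip(np.rint(v), 0, h - 1))
    return feature_map[iv, iu]                # (C,) conditioning vector
```

The returned vector is what the radiance-field MLP would consume alongside the point's coordinates, which is what lets a single network generalize across scenes.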

Unsupervised Learning of 3D Object Categories from Videos in the Wild

A new neural network design is proposed, called warp-conditioned ray embedding (WCR), which significantly improves reconstruction while obtaining a detailed implicit representation of the object surface and texture, also compensating for the noise in the initial SfM reconstruction that bootstrapped the learning process.

Unsupervised Learning of Probably Symmetric Deformable 3D Objects From Images in the Wild

We propose a method to learn 3D deformable object categories from raw single-view images, without external supervision. The method is based on an autoencoder that factors each input image into depth, …

HoloGAN: Unsupervised Learning of 3D Representations From Natural Images

HoloGAN is the first generative model that learns 3D representations from natural images in an entirely unsupervised manner and is shown to be able to generate images with similar or higher visual quality than other generative models.

CodeNeRF: Disentangled Neural Radiance Fields for Object Categories

W. Jang and L. Agapito. 2021 IEEE/CVF International Conference on Computer Vision (ICCV), 2021.
Results on real-world images demonstrate that CodeNeRF can bridge the sim-to-real gap and generalises well to unseen objects and achieves on-par performance with methods that require known camera pose at test time.

GRAF: Generative Radiance Fields for 3D-Aware Image Synthesis

This paper proposes a generative model for radiance fields, which have recently proven successful for novel view synthesis of single scenes, and introduces a multi-scale patch-based discriminator to demonstrate synthesis of high-resolution images while training the model from unposed 2D images alone.

Occupancy Networks: Learning 3D Reconstruction in Function Space

This paper proposes Occupancy Networks, a new representation for learning-based 3D reconstruction methods that encodes a description of the 3D output at infinite resolution without excessive memory footprint, and validate that the representation can efficiently encode 3D structure and can be inferred from various kinds of input.
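The "infinite resolution" property of occupancy networks comes from representing the shape as a function that can be queried at arbitrary continuous 3D coordinates. A minimal, untrained sketch (class name, layer sizes, and random weights are all illustrative assumptions, not the paper's architecture):

```python
import numpy as np

rng = np.random.default_rng(0)

class TinyOccupancyNet:
    """Minimal occupancy decoder: maps a (latent code, 3D point) pair to
    an occupancy probability in [0, 1]. Random weights; illustrative only."""

    def __init__(self, latent_dim=16, hidden=32):
        self.w1 = rng.standard_normal((latent_dim + 3, hidden)) * 0.1
        self.b1 = np.zeros(hidden)
        self.w2 = rng.standard_normal((hidden, 1)) * 0.1
        self.b2 = np.zeros(1)

    def __call__(self, z, points):
        # z: (latent_dim,) shape code; points: (N, 3) query locations
        x = np.concatenate(
            [np.broadcast_to(z, (len(points), len(z))), points], axis=1)
        h = np.maximum(x @ self.w1 + self.b1, 0.0)      # ReLU
        logits = h @ self.w2 + self.b2
        return 1.0 / (1.0 + np.exp(-logits[:, 0]))      # sigmoid
```

Because the decoder accepts any real-valued coordinates, a mesh can later be extracted at whatever grid resolution is desired, without the memory cost of storing a voxel grid during training.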

GRF: Learning a General Radiance Field for 3D Scene Representation and Rendering

The key to the approach is to explicitly integrate the principle of multi-view geometry to obtain the internal representations from observed 2D views, guaranteeing that the learned implicit representations are meaningful and multi-view consistent.

NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis

This work describes how to effectively optimize neural radiance fields to render photorealistic novel views of scenes with complicated geometry and appearance, and demonstrates results that outperform prior work on neural rendering and view synthesis.
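The volume rendering that NeRF optimizes through can be sketched with a few lines of NumPy: each sample along a ray contributes its color weighted by its opacity and by the transmittance accumulated in front of it. The function name is an illustrative assumption; the piecewise-constant quadrature below follows the standard formulation:

```python
import numpy as np

def composite_along_ray(densities, colors, deltas):
    """Numerical quadrature of the volume rendering integral for one ray.

    densities: (N,) non-negative sigma at each sample along the ray
    colors:    (N, 3) RGB at each sample
    deltas:    (N,) distances between adjacent samples
    """
    # Opacity of each segment: alpha_i = 1 - exp(-sigma_i * delta_i)
    alphas = 1.0 - np.exp(-densities * deltas)
    # Transmittance: probability the ray reaches sample i unoccluded
    trans = np.concatenate([[1.0], np.cumprod(1.0 - alphas)[:-1]])
    weights = trans * alphas                      # per-sample contribution
    rgb = (weights[:, None] * colors).sum(axis=0)
    return rgb, weights
```

Because every operation here is differentiable, the photometric loss on the composited RGB can be backpropagated to the density and color predictions, which is what makes training from posed 2D images possible.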

UNISURF: Unifying Neural Implicit Surfaces and Radiance Fields for Multi-View Reconstruction

This work shows that implicit surface models and radiance fields can be formulated in a unified way, enabling both surface and volume rendering using the same model, and outperforms NeRF in terms of reconstruction quality while performing on par with IDR without requiring masks.
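The unification UNISURF describes can be illustrated by using the occupancy value itself as the per-sample alpha in the same compositing loop NeRF uses: a soft occupancy field behaves like volume rendering, while a hard 0/1 field collapses all weight onto the first occupied sample, i.e., classic surface rendering. A minimal sketch (function name is an illustrative assumption):

```python
import numpy as np

def unified_render(occupancies, colors):
    """Alpha compositing where per-sample alpha is the occupancy o(x_i).

    occupancies: (N,) values in [0, 1] along the ray
    colors:      (N, 3) RGB at each sample
    """
    # Transmittance before each sample; hard 0/1 occupancy makes this a
    # step function, so only the first occupied sample receives weight.
    trans = np.concatenate([[1.0], np.cumprod(1.0 - occupancies)[:-1]])
    weights = trans * occupancies
    return (weights[:, None] * colors).sum(axis=0), weights
```

This single formulation is what lets the same model be supervised with volume rendering early in training and still yield a well-defined surface (the 0.5 level set of the occupancy) afterwards.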