Deep Optimized Priors for 3D Shape Modeling and Reconstruction

@article{Yang2021DeepOP,
  title={Deep Optimized Priors for 3D Shape Modeling and Reconstruction},
  author={Mingyue Yang and Yuxin Wen and Weikai Chen and Yong Song Chen and Kui Jia},
  journal={2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2021},
  pages={3268-3277}
}
  • Mingyue Yang, Yuxin Wen, K. Jia
  • Published 14 December 2020
  • Computer Science
  • 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
Many learning-based approaches have difficulty scaling to unseen data, as the generality of their learned priors is limited to the scale and variation of the training samples. This holds particularly true for 3D learning tasks, given the scarcity of available 3D datasets. We introduce a new learning framework for 3D modeling and reconstruction that greatly improves the generalization ability of a deep generator. Our approach strives to connect the good ends of both learning-based and…
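
The abstract describes combining a learned prior with test-time optimization so the model can fit unseen inputs. As a rough, non-authoritative sketch of that general idea (not the authors' code), the snippet below jointly refines a shape latent code and a pre-trained generator against a partial observation; the stand-in generator, Chamfer loss, learning rates, and regularization weight are all illustrative assumptions.

    # Minimal sketch: jointly refine a shape latent code and a pre-trained
    # generator so the generated shape fits a partial observation, while a
    # regularizer keeps the solution close to the learned prior.
    import torch
    import torch.nn as nn

    def chamfer(a, b):
        """Symmetric Chamfer distance between point sets a (N, 3) and b (M, 3)."""
        d = torch.cdist(a, b)                      # (N, M) pairwise distances
        return d.min(dim=1).values.mean() + d.min(dim=0).values.mean()

    # Stand-in generator: latent code -> point set; in practice this would be
    # a pre-trained network (e.g. a point-cloud or implicit-surface decoder).
    decoder = nn.Sequential(nn.Linear(128, 256), nn.ReLU(),
                            nn.Linear(256, 1024 * 3))

    observed = torch.rand(300, 3)                  # partial observation (placeholder data)
    z = torch.zeros(128, requires_grad=True)       # shape latent code to optimize

    opt = torch.optim.Adam([{"params": [z], "lr": 1e-2},
                            {"params": decoder.parameters(), "lr": 1e-4}])

    for step in range(500):
        opt.zero_grad()
        pred = decoder(z).view(-1, 3)              # decoded full shape
        loss = chamfer(pred, observed) + 1e-3 * z.norm() ** 2   # data term + latent prior
        loss.backward()
        opt.step()
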
Mending Neural Implicit Modeling for 3D Vehicle Reconstruction in the Wild
TLDR
This work demonstrates high-quality in-the-wild shape reconstruction using a deep encoder as a robust initializer of the shape latent code, a deep discriminator as a learned high-dimensional shape prior, and a novel curriculum learning strategy that allows the model to learn shape priors on synthetic data and smoothly transfer them to sparse real-world data.
FvOR: Robust Joint Shape and Pose Optimization for Few-view Object Reconstruction
TLDR
FvOR is a learning-based object reconstruction method that predicts accurate 3D models given a few images with noisy input poses using learnable neural network modules and achieves best-in-class results.
SA-ConvONet: Sign-Agnostic Optimization of Convolutional Occupancy Networks
TLDR
This work proposes to learn implicit surface reconstruction by sign-agnostic optimization of convolutional occupancy networks, to simultaneously achieve advanced scalability to large-scale scenes, generality to novel shapes, and applicability to raw scans in a unified framework.
Sign-Agnostic CONet: Learning Implicit Surface Reconstructions by Sign-Agnostic Optimization of Convolutional Occupancy Networks
TLDR
This paper proposes to learn implicit surface reconstruction by sign-agnostic optimization of convolutional occupancy networks, to simultaneously achieve advanced scalability, generality, and applicability in a unified framework and shows this goal can be effectively achieved by a simple yet effective design.
Surface Reconstruction from Point Clouds by Learning Predictive Context Priors
TLDR
Predictive Context Priors are introduced by learning predictive queries for each specific point cloud at inference time; the query prediction allows the learned local context prior to be leveraged over the entire prior space rather than being restricted to the query locations, which improves generalizability.
Learning Parallel Dense Correspondence from Spatio-Temporal Descriptors for Efficient and Robust 4D Reconstruction
TLDR
This work presents a novel pipeline that learns the temporal evolution of 3D human shape through spatially continuous transformation functions between cross-frame occupancy fields, by explicitly learning continuous displacement vector fields from robust spatio-temporal shape representations.
HM3D-ABO: A Photo-realistic Dataset for Object-centric Multi-view 3D Reconstruction
TLDR
This report presents HM3D-ABO, a photo-realistic object-centric dataset constructed by composing realistic indoor scenes with realistic objects, providing multi-view RGB observations, a watertight mesh model for each object, ground-truth depth maps, and object masks.
Task-Generic Hierarchical Human Motion Prior using VAEs
TLDR
This paper proposes a hierarchical motion variational autoencoder (HM-VAE) with a two-level hierarchical latent space that can fix corrupted human-body animations and generate complete movements from incomplete observations, and demonstrates the model's effectiveness on a variety of tasks.
Surface Reconstruction from Point Clouds: A Survey and a Benchmark
TLDR
This paper contributes a large-scale benchmark dataset consisting of both synthetic and real-scanned data, and conducts thorough empirical studies comparing existing methods on the constructed benchmark, paying special attention to their robustness against various scanning imperfections.
Representing 3D Shapes with Probabilistic Directed Distance Fields
TLDR
This work aims to address both shortcomings with a novel shape representation that allows fast differentiable rendering within an implicit architecture, and shows strong performance with simple architectural components via the versatility of the representation.
...
...

References

SHOWING 1-10 OF 43 REFERENCES
Occupancy Networks: Learning 3D Reconstruction in Function Space
TLDR
This paper proposes Occupancy Networks, a new representation for learning-based 3D reconstruction methods that encodes a description of the 3D output at infinite resolution without excessive memory footprint, and validates that the representation can efficiently encode 3D structure and be inferred from various kinds of input.
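
For orientation, a minimal sketch of the occupancy-function idea this reference describes, assuming a latent-conditioned MLP with illustrative sizes; it is not the paper's architecture.

    # Rough sketch: a network maps a 3D query point (conditioned on a shape
    # code) to the probability that the point lies inside the surface; the
    # surface is the 0.5 level set of the resulting field.
    import torch
    import torch.nn as nn

    class OccupancyMLP(nn.Module):
        def __init__(self, latent_dim=128, hidden=256):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(3 + latent_dim, hidden), nn.ReLU(),
                nn.Linear(hidden, hidden), nn.ReLU(),
                nn.Linear(hidden, 1))

        def forward(self, points, z):
            # points: (N, 3) query locations; z: (latent_dim,) shape code
            z = z.expand(points.shape[0], -1)
            return torch.sigmoid(self.net(torch.cat([points, z], dim=-1)))

    # Query a dense grid at any chosen resolution; a mesh can then be
    # extracted from the occupancy field (e.g. with marching cubes).
    model = OccupancyMLP()
    grid = torch.stack(torch.meshgrid(*[torch.linspace(-1, 1, 32)] * 3,
                                      indexing="ij"), dim=-1).reshape(-1, 3)
    occ = model(grid, torch.zeros(128)).reshape(32, 32, 32)
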
Learning to Infer Implicit Surfaces without 3D Supervision
TLDR
A novel ray-based field probing technique for efficient image-to-field supervision is proposed, along with a general geometric regularizer for implicit surfaces that provides natural shape priors in unconstrained regions.
3D-R2N2: A Unified Approach for Single and Multi-view 3D Object Reconstruction
TLDR
The 3D-R2N2 reconstruction framework outperforms state-of-the-art methods for single-view reconstruction and enables 3D reconstruction of objects in situations where traditional SfM/SLAM methods fail (because of a lack of texture and/or a wide baseline).
Learning Representations and Generative Models for 3D Point Clouds
TLDR
A deep autoencoder network with state-of-the-art reconstruction quality and generalization ability is introduced, with results that outperform existing methods on 3D recognition tasks and enable shape editing via simple algebraic manipulations.
Learning a Multi-View Stereo Machine
TLDR
End-to-end learning allows us to jointly reason about shape priors while conforming to geometric constraints, enabling reconstruction from far fewer images than required by classical approaches, as well as completion of unseen surfaces.
A Skeleton-Bridged Deep Learning Approach for Generating Meshes of Complex Topologies From Single RGB Images
TLDR
Qualitative and quantitative results on representative object categories of both simple and complex topologies demonstrate the superiority of the proposed skeleton-bridged, stage-wise learning approach over existing methods.
Differentiable Volumetric Rendering: Learning Implicit 3D Representations Without 3D Supervision
TLDR
This work proposes a differentiable rendering formulation for implicit shape and texture representations, showing that depth gradients can be derived analytically using the concept of implicit differentiation, and finds that this method can be used for multi-view 3D reconstruction, directly resulting in watertight meshes.
SkeletonNet: A Topology-Preserving Solution for Learning Mesh Reconstruction of Object Surfaces from RGB Images
TLDR
The novel SkeletonNet design, which learns a volumetric representation of the skeleton via bridged learning of a skeletal point set, is proposed, along with two models that improve over existing frameworks of explicit mesh deformation and implicit field learning for the surface reconstruction task.
Deep Geometric Prior for Surface Reconstruction
TLDR
This work proposes the use of a deep neural network as a geometric prior for surface reconstruction, overfitting a network that represents a local chart parameterization to part of an input point cloud, using the Wasserstein distance as the measure of approximation.
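
A minimal sketch of this "network as geometric prior" idea, assuming a small MLP chart overfit to a single local patch; Chamfer distance is used here as a simpler stand-in for the paper's Wasserstein (Sinkhorn) loss.

    # Minimal sketch: overfit a small MLP chart phi: [0,1]^2 -> R^3 to a
    # local patch of the input point cloud so the chart acts as a smooth
    # geometric prior for that region.
    import torch
    import torch.nn as nn

    patch = torch.rand(500, 3)                    # local patch of the input cloud (placeholder)
    chart = nn.Sequential(nn.Linear(2, 128), nn.ReLU(),
                          nn.Linear(128, 128), nn.ReLU(),
                          nn.Linear(128, 3))
    opt = torch.optim.Adam(chart.parameters(), lr=1e-3)

    for step in range(1000):
        uv = torch.rand(500, 2)                   # random samples in the parameter domain
        pred = chart(uv)                          # points on the fitted chart
        d = torch.cdist(pred, patch)
        loss = d.min(dim=1).values.mean() + d.min(dim=0).values.mean()  # Chamfer stand-in
        opt.zero_grad()
        loss.backward()
        opt.step()
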
Pixel2Mesh++: Multi-View 3D Mesh Generation via Deformation
TLDR
This model learns to predict a series of deformations to iteratively improve a coarse shape, and exhibits generalization capability across different semantic categories, numbers of input images, and qualities of mesh initialization.
...
...