Im2Struct: Recovering 3D Shape Structure from a Single RGB Image

@inproceedings{Niu2018Im2StructR3,
  title={Im2Struct: Recovering 3D Shape Structure from a Single RGB Image},
  author={Chengjie Niu and Jun Li and Kai Xu},
  booktitle={2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2018},
  pages={4521-4529}
}
  • Published 16 April 2018
We propose to recover 3D shape structures from single RGB images, where structure refers to shape parts represented by cuboids together with part relations encompassing connectivity and symmetry. Given a single 2D image depicting an object, our goal is to automatically recover a cuboid structure of the object's parts as well as their mutual relations. We develop a convolutional-recursive auto-encoder that first parses the structure of the 2D image and then recovers a cuboid hierarchy. The… 
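The cuboid hierarchy the abstract describes can be pictured as a small tree: leaves hold cuboid part proxies, internal nodes encode part relations such as adjacency or symmetry. Below is a minimal, hypothetical sketch of that output representation in Python; the class names, fields, and toy values are illustrative assumptions, not the authors' implementation.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

@dataclass
class Cuboid:
    center: Tuple[float, float, float]  # (x, y, z) cuboid centre
    dims: Tuple[float, float, float]    # (w, h, d) edge lengths; orientation omitted

@dataclass
class Node:
    kind: str                           # "leaf" | "adjacency" | "symmetry"
    cuboid: Optional[Cuboid] = None     # set only on leaves
    children: List["Node"] = field(default_factory=list)

def collect_cuboids(node: Node) -> List[Cuboid]:
    """Flatten a recovered hierarchy into its list of part cuboids."""
    if node.kind == "leaf":
        return [node.cuboid]
    parts: List[Cuboid] = []
    for child in node.children:
        parts.extend(collect_cuboids(child))
    return parts

# Toy chair: a seat plus a symmetry node generating a pair of legs.
seat = Node("leaf", Cuboid((0.0, 0.5, 0.0), (1.0, 0.1, 1.0)))
legs = Node("symmetry", children=[Node("leaf", Cuboid((0.4, 0.25, 0.4), (0.05, 0.5, 0.05)))])
chair = Node("adjacency", children=[seat, legs])
print(len(collect_cuboids(chair)))  # 2 part cuboids in the flattened hierarchy
```

A real decoder would also store symmetry parameters (axis, fold count) on the internal nodes so the symmetric copies can be instantiated; the flat traversal above only shows how the hierarchy bottoms out in cuboid parts.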

Citations

Learning Unsupervised Hierarchical Part Decomposition of 3D Objects From a Single RGB Image
TLDR
This work proposes a novel formulation that jointly recovers the geometry of a 3D object as a set of primitives as well as their latent hierarchical structure without part-level supervision, and recovers the higher-level structural decomposition of various objects in the form of a binary tree of primitives.
Single Image 3D Object Estimation with Primitive Graph Networks
TLDR
A two-stage graph network for primitive-based 3D object estimation, which consists of a sequential proposal module and a graph reasoning module, capable of taking into account rich geometry and semantic constraints during 3D structure recovery, producing 3D objects with more coherent structure even under challenging viewing conditions.
Cuboids Revisited: Learning Robust 3D Shape Fitting to Single RGB Images
TLDR
This work proposes a robust estimator for primitive fitting, which can meaningfully abstract real-world environments using cuboids and does not require labour-intensive labels, such as cuboid annotations, for training.
D2IM-Net: Learning Detail Disentangled Implicit Fields from Single Images
  • Manyi Li, Hao Zhang
  • Computer Science
  • 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
  • 2021
TLDR
The final 3D reconstruction is a fusion between the base shape and the displacement maps, with three losses enforcing the recovery of coarse shape, overall structure, and surface details via a novel Laplacian term.
Neural Implicit 3D Shapes from Single Images with Spatial Patterns
TLDR
The key to the work is the ubiquitousness of the spatial patterns across shapes, which enables reasoning about invisible parts of the underlying objects and thus greatly mitigates the occlusion issue.
Learning Single-Image 3D Reconstruction by Generative Modelling of Shape, Pose and Shading
TLDR
A unified framework tackling two problems: class-specific 3D reconstruction from a single image, and generation of new 3D shape samples; it can learn to generate and reconstruct concave object classes such as bathtubs and sofas, which silhouette-based methods cannot learn.
STD-Net: Structure-preserving and Topology-adaptive Deformation Network for 3D Reconstruction from a Single Image
TLDR
Experimental results on the images from ShapeNet show that the proposed STD-Net has better performance than other state-of-the-art methods on reconstructing 3D objects with complex structures and fine geometric details.
Weakly Supervised Part‐wise 3D Shape Reconstruction from Single‐View RGB Images
TLDR
A deep neural network is learned which takes a single‐view RGB image as input and outputs a 3D shape in parts, represented as 3D point clouds, using an array of 3D part generators to produce shapes with both correct part geometry and reasonable overall structure.
Pix2Shape: Towards Unsupervised Learning of 3D Scenes from Images Using a View-Based Representation
TLDR
Pix2Shape learns a consistent scene representation in its encoded latent space, and that the decoder can then be applied to this latent representation in order to synthesize the scene from a novel viewpoint.

References

Showing 1–10 of 26 references
3D Shape Segmentation with Projective Convolutional Networks
TLDR
This paper introduces a deep architecture for segmenting 3D objects into their labeled semantic parts that significantly outperforms the existing state-of-the-art methods in the currently largest segmentation benchmark (ShapeNet).
3D Shape Reconstruction from Sketches via Multi-view Convolutional Networks
We propose a method for reconstructing 3D shapes from 2D sketches in the form of line drawings. Our method takes as input a single sketch, or multiple sketches, and outputs a dense point cloud.
3D-R2N2: A Unified Approach for Single and Multi-view 3D Object Reconstruction
TLDR
The 3D-R2N2 reconstruction framework outperforms the state-of-the-art methods for single view reconstruction, and enables the 3D reconstruction of objects in situations when traditional SFM/SLAM methods fail (because of lack of texture and/or wide baseline).
GRASS: Generative Recursive Autoencoders for Shape Structures
TLDR
A novel neural network architecture for encoding and synthesis of 3D shapes, particularly their structures, is introduced and it is demonstrated that without supervision, the network learns meaningful structural hierarchies adhering to perceptual grouping principles, produces compact codes which enable applications such as shape classification and partial matching, and supports shape synthesis and interpolation with significant variations in topology and geometry.
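The recursive-encoding idea behind GRASS (and the convolutional-recursive auto-encoder of Im2Struct) merges sibling part codes bottom-up into a single root code. The sketch below illustrates only that recursion with a fixed toy merge (element-wise mean); in the actual papers the merge and split operations are small learned networks, so every name here is a hypothetical stand-in.

```python
from typing import List, Tuple, Union

# A "tree" is either a leaf feature vector or a (left, right) pair of subtrees.
Tree = Union[List[float], Tuple["Tree", "Tree"]]

def merge(left: List[float], right: List[float]) -> List[float]:
    """Toy stand-in for a learned two-to-one merge network."""
    return [(a + b) / 2.0 for a, b in zip(left, right)]

def encode(tree: Tree) -> List[float]:
    """Recursively collapse a part hierarchy into one root code."""
    if isinstance(tree, tuple):
        return merge(encode(tree[0]), encode(tree[1]))
    return tree

# Two leaf codes merged, then merged again with a third leaf.
root = encode((([1.0, 2.0], [3.0, 4.0]), [5.0, 6.0]))
print(root)  # [3.5, 4.5]
```

The decoder side runs the recursion in reverse, splitting a code into child codes until leaves are reached; that is the "structure recovering" stage the Im2Struct abstract refers to.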
A Point Set Generation Network for 3D Object Reconstruction from a Single Image
TLDR
This paper addresses the problem of 3D reconstruction from a single image, generating a straightforward yet unorthodox form of output, and designs architecture, loss function and learning paradigm that are novel and effective, capable of predicting multiple plausible 3D point clouds from an input image.
Deeper Depth Prediction with Fully Convolutional Residual Networks
TLDR
A fully convolutional architecture, encompassing residual learning, to model the ambiguous mapping between monocular images and depth maps is proposed and a novel way to efficiently learn feature map up-sampling within the network is presented.
A Two-Streamed Network for Estimating Fine-Scaled Depth Maps from Single RGB Images
TLDR
A fast-to-train two-streamed CNN that predicts depth and depth gradients, which are then fused together into an accurate and detailed depth map, and defines a novel set loss over multiple images.
Estimating image depth using shape collections
TLDR
This paper considers the problem of adding depth to an image of an object, effectively 'lifting' it back to 3D, by exploiting a collection of aligned 3D models of related objects, and concludes that the network of shapes implicitly characterizes a shape-specific deformation subspace that regularizes the problem and enables robust diffusion of depth information from the shape collection to the input image.
Photo-inspired model-driven 3D object modeling
TLDR
An algorithm for 3D object modeling where the user draws creative inspiration from an object captured in a single photograph to create a digital 3D model as a geometric variation from a 3D candidate.
Learning a Probabilistic Latent Space of Object Shapes via 3D Generative-Adversarial Modeling
TLDR
A novel framework, namely 3D Generative Adversarial Network (3D-GAN), which generates 3D objects from a probabilistic space by leveraging recent advances in volumetric convolutional networks and generative adversarial nets, and a powerful 3D shape descriptor which has wide applications in 3D object recognition.