ShapeNet: An Information-Rich 3D Model Repository
TLDR
ShapeNet contains 3D models from a multitude of semantic categories, organized under the WordNet taxonomy. It is a collection of datasets providing many semantic annotations for each 3D model, such as consistent rigid alignments, parts and bilateral symmetry planes, physical sizes, and keywords, as well as other planned annotations.
ScanNet: Richly-Annotated 3D Reconstructions of Indoor Scenes
TLDR
This work introduces ScanNet, an RGB-D video dataset containing 2.5M views in 1513 scenes annotated with 3D camera poses, surface reconstructions, and semantic segmentations, and shows that using this data helps achieve state-of-the-art performance on several 3D scene understanding tasks.
The Princeton Shape Benchmark
TLDR
It is concluded that no single descriptor is best for all classifications, and thus the main contribution of this paper is to provide a framework to determine the conditions under which each descriptor performs best.
Semantic Scene Completion from a Single Depth Image
TLDR
The semantic scene completion network (SSCNet) is introduced, an end-to-end 3D convolutional network that takes a single depth image as input and simultaneously outputs occupancy and semantic labels for all voxels in the camera view frustum.
Rotation Invariant Spherical Harmonic Representation of 3D Shape Descriptors
TLDR
The limitations of canonical alignment are described, and an alternative method based on spherical harmonics is discussed for obtaining rotation-invariant representations. The method reduces the dimensionality of the descriptor, giving a more compact representation that makes comparing two models more efficient.
Matterport3D: Learning from RGB-D Data in Indoor Environments
TLDR
Matterport3D is introduced, a large-scale RGB-D dataset containing 10,800 panoramic views from 194,400 RGB-D images of 90 building-scale scenes that enable a variety of supervised and self-supervised computer vision tasks, including keypoint matching, view overlap prediction, normal prediction from color, semantic segmentation, and region classification.
Dilated Residual Networks
TLDR
It is shown that dilated residual networks (DRNs) outperform their non-dilated counterparts in image classification without increasing the model's depth or complexity, and the accuracy advantage of DRNs is further magnified in downstream applications such as object localization and semantic segmentation.
A benchmark for 3D mesh segmentation
TLDR
The results suggest that people are remarkably consistent in the way that they segment most 3D surface meshes, that no one automatic segmentation algorithm is better than the others for all types of objects, and that algorithms based on non-local shape features seem to produce segmentations that most closely resemble ones made by humans.
Shape distributions
TLDR
The dissimilarities between sampled distributions of simple shape functions provide a robust method for discriminating between classes of objects in a moderately sized database, despite the presence of arbitrary translations, rotations, scales, mirrors, tessellations, simplifications, and model degeneracies.
A search engine for 3D models
TLDR
A new matching algorithm is developed that uses spherical harmonics to compute discriminating similarity measures without requiring repair of model degeneracies or alignment of orientations and provides 46 to 245% better performance than related shape-matching methods during precision-recall experiments.