Escape from Cells: Deep Kd-Networks for the Recognition of 3D Point Cloud Models

@inproceedings{Klokov2017EscapeFC,
  title={Escape from Cells: Deep Kd-Networks for the Recognition of 3D Point Cloud Models},
  author={Roman Klokov and Victor S. Lempitsky},
  booktitle={2017 IEEE International Conference on Computer Vision (ICCV)},
  year={2017},
  pages={863--872}
}
  • Roman Klokov, V. Lempitsky
  • Published 4 April 2017
  • Computer Science
  • 2017 IEEE International Conference on Computer Vision (ICCV)
We present a new deep learning architecture (called Kd-network) that is designed for 3D model recognition tasks and works with unstructured point clouds. The new architecture performs multiplicative transformations and shares parameters of these transformations according to the subdivisions of the point clouds imposed onto them by kd-trees. Unlike the currently dominant convolutional architectures that usually require rasterization on uniform two-dimensional or three-dimensional grids, Kd-networks…
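The kd-tree subdivision mentioned in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper's implementation: the function names `build_kdtree`, `leaves`, and `parameter_keys` are hypothetical, the split axis is chosen as the coordinate with the largest range, and the `(depth, axis)` pairs only suggest one way shared parameters could be indexed by the subdivision.

```python
from typing import Dict, List, Sequence, Set, Tuple

Point = Tuple[float, float, float]


def build_kdtree(points: Sequence[Point]) -> Dict:
    """Recursively split a point set into a balanced kd-tree.

    Assumption: the split axis is the coordinate with the largest range,
    and each leaf holds exactly one point.
    """
    if len(points) == 1:
        return {"leaf": points[0]}
    # Pick the axis along which the points are most spread out.
    axis = max(
        range(3),
        key=lambda a: max(p[a] for p in points) - min(p[a] for p in points),
    )
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    return {
        "axis": axis,
        "left": build_kdtree(ordered[:mid]),
        "right": build_kdtree(ordered[mid:]),
    }


def leaves(tree: Dict) -> List[Point]:
    """Return the leaf points in left-to-right order."""
    if "leaf" in tree:
        return [tree["leaf"]]
    return leaves(tree["left"]) + leaves(tree["right"])


def parameter_keys(tree: Dict, depth: int = 0) -> Set[Tuple[int, int]]:
    """Collect (depth, split axis) pairs of all internal nodes.

    Hypothetical: nodes with the same pair could share one set of weights.
    """
    if "leaf" in tree:
        return set()
    keys = {(depth, tree["axis"])}
    keys |= parameter_keys(tree["left"], depth + 1)
    keys |= parameter_keys(tree["right"], depth + 1)
    return keys


if __name__ == "__main__":
    cloud = [(0.0, 0.0, 0.0), (0.0, 10.0, 0.0), (1.0, 0.0, 0.0), (1.0, 10.0, 0.0)]
    tree = build_kdtree(cloud)
    print(sorted(parameter_keys(tree)))
```

In this sketch a cloud of 2^D points yields a tree of depth D, so the number of distinct `(depth, axis)` keys is at most 3D regardless of how many points the cloud contains.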


Multiresolution Tree Networks for 3D Point Cloud Processing
TLDR
This model represents a 3D shape as a set of locality-preserving 1D ordered lists of points at multiple resolutions, which allows efficient feed-forward processing through 1D convolutions and coarse-to-fine analysis through a multi-grid architecture, and leads to faster convergence and a small memory footprint during training.
PC-Net: A Deep Network for 3D Point Clouds Analysis
TLDR
This work proposes a simple but effective approach for 3D point cloud analysis, named PC-Net, which learns directly on point sets and is equipped with three new operations: first, it applies a novel scale-aware neighbor search for adaptive neighborhood extraction; second, for each neighboring point, it learns a local spatial feature as a complement to its associated features; finally, it uses distance-reweighted pooling to aggregate all the features from the local structure.
A-CNN: Annularly Convolutional Neural Networks on Point Clouds
TLDR
A new method to define and compute convolution directly on 3D point clouds by the proposed annular convolution that can better capture the local neighborhood geometry of each point by specifying the (regular and dilated) ring-shaped structures and directions in the computation.
3D Point Cloud Classification and Segmentation using 3D Modified Fisher Vector Representation for Convolutional Neural Networks
TLDR
A novel 3D point cloud representation called 3D Modified Fisher Vectors (3DmFV) is proposed, which combines the discrete structure of a grid with continuous generalization of Fisher vectors, in a compact and computationally efficient way.
3DContextNet: K-d Tree Guided Hierarchical Learning of Point Clouds Using Local Contextual Cues
TLDR
This paper proposes a method that directly uses point clouds as input and exploits the implicit space partition of the k-d tree structure to learn local contextual information and aggregate features hierarchically at different scales.
R-CovNet: Recurrent Neural Convolution Network for 3D Object Recognition
TLDR
This paper proposes a new deep learning architecture called R-CovNet, designed for 3D object recognition, which provides a permutation-invariant architecture specially designed for point cloud data of any size.
Nesti-Net: Normal Estimation for Unstructured 3D Point Clouds Using Convolutional Neural Networks
TLDR
The Nesti-Net method builds on a new local point cloud representation which consists of multi-scale point statistics (MuPS), estimated on a local coarse Gaussian grid, which is a suitable input to a CNN architecture.
EMBEDDING FOR 3D DATA PROCESSING
TLDR
The architecture, named PE-Net, learns the representation of point clouds in high-dimensional space and encodes the unordered input points into feature vectors, to which standard 2D CNNs can be applied.

References

PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation
TLDR
This paper designs a novel type of neural network that directly consumes point clouds, which well respects the permutation invariance of points in the input and provides a unified architecture for applications ranging from object classification, part segmentation, to scene semantic parsing.
FPNN: Field Probing Neural Networks for 3D Data
TLDR
This work represents 3D spaces as volumetric fields, and proposes a novel design that employs field probing filters to efficiently extract features from them, showing that field probing is significantly more efficient than 3DCNNs, while providing state-of-the-art performance, on classification tasks for 3D object recognition benchmark datasets.
SHREC'17 Track: Large-Scale 3D Shape Retrieval from ShapeNet Core55
TLDR
Overall performance on the shape retrieval task has improved significantly compared to the iteration of this competition in SHREC 2016, and all data, results, and evaluation code are released to catalyze future research into large-scale 3D shape retrieval.
Multi-view Convolutional Neural Networks for 3D Shape Recognition
TLDR
This work presents a standard CNN architecture trained to recognize the shapes' rendered views independently of each other, and shows that a 3D shape can be recognized even from a single view at an accuracy far higher than using state-of-the-art 3D shape descriptors.
3D ShapeNets: A deep representation for volumetric shapes
TLDR
This work proposes to represent a geometric 3D shape as a probability distribution of binary variables on a 3D voxel grid, using a Convolutional Deep Belief Network, and shows that this 3D deep representation enables significant performance improvement over the state of the art in a variety of tasks.
OctNet: Learning Deep 3D Representations at High Resolutions
TLDR
The utility of the OctNet representation is demonstrated by analyzing the impact of resolution on several 3D tasks including 3D object classification, orientation estimation and point cloud labeling.
Deep Learning with Sets and Point Clouds
TLDR
This work uses deep permutation-invariant networks to perform point-cloud classification and MNIST-digit summation, where in both cases the output is invariant to permutations of the input.
VoxNet: A 3D Convolutional Neural Network for real-time object recognition
  • Daniel Maturana, S. Scherer
  • Computer Science
  • 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
  • 2015
TLDR
VoxNet is proposed, an architecture to tackle the problem of robust object recognition by integrating a volumetric Occupancy Grid representation with a supervised 3D Convolutional Neural Network (3D CNN).
FusionNet: 3D Object Classification Using Multiple Data Representations
TLDR
New Volumetric CNN (V-CNN) architectures are introduced and exploited to learn new features, which yield a significantly better classifier than using either of the representations in isolation.
Learning class‐specific descriptors for deformable shapes using localized spectral convolutional networks
TLDR
Experimental results show that the proposed approach allows learning class‐specific shape descriptors significantly outperforming recent state‐of‐the‐art methods on standard benchmarks.