Learning Canonical View Representation for 3D Shape Recognition with Arbitrary Views

  • Xin Wei, Yifei Gong, Fudong Wang, Xing Sun, Jian Sun
  • Published 16 August 2021
  • Computer Science
  • 2021 IEEE/CVF International Conference on Computer Vision (ICCV)
In this paper, we focus on recognizing 3D shapes from arbitrary views, i.e., arbitrary numbers and positions of viewpoints. It is a challenging and realistic setting for view-based 3D shape recognition. We propose a canonical view representation to tackle this challenge. We first transform the original features of arbitrary views to a fixed number of view features, dubbed canonical view representation, by aligning the arbitrary view features to a set of learnable reference view features using… 
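
The idea of mapping an arbitrary number of view features onto a fixed number of slots can be sketched as follows. The abstract is cut off before it names the alignment method, so this sketch swaps in a plain softmax soft-assignment (cross-attention style) as a stand-in; it is not the authors' method. The names `canonical_views`, `ref_feats`, and `temperature` are hypothetical, and the reference features are fixed random vectors here, whereas the paper describes them as learnable.

```python
import math
import random


def dot(a, b):
    """Inner product of two equal-length feature vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def canonical_views(view_feats, ref_feats, temperature=1.0):
    """Map N view features (N x d) onto M reference slots (M x d).

    Assumption: each reference feature pulls in a softmax-weighted
    average of the input views. This is an illustrative stand-in for
    the alignment step, which the truncated abstract does not specify.
    """
    dim = len(ref_feats[0])
    canonical = []
    for ref in ref_feats:
        # Similarity of this reference slot to every input view.
        weights = softmax([dot(ref, v) / temperature for v in view_feats])
        # Weighted average of the views, computed per feature dimension.
        slot = [sum(w * v[k] for w, v in zip(weights, view_feats))
                for k in range(dim)]
        canonical.append(slot)
    return canonical


rng = random.Random(0)
DIM, NUM_REFS = 8, 4
# Stand-in for the learnable reference view features (fixed here).
refs = [[rng.gauss(0, 1) for _ in range(DIM)] for _ in range(NUM_REFS)]

for num_views in (1, 3, 12):
    views = [[rng.gauss(0, 1) for _ in range(DIM)] for _ in range(num_views)]
    out = canonical_views(views, refs)
    print(num_views, "views ->", len(out), "x", len(out[0]))
```

Whatever the number of input views, the output always has `NUM_REFS` rows, and it does not depend on the order in which the views are supplied, which is the property a downstream classifier needs in the arbitrary-view setting.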

Multi-view Convolutional Neural Networks for 3D Shape Recognition
This work presents a standard CNN architecture trained to recognize the shapes' rendered views independently of each other, and shows that a 3D shape can be recognized even from a single view at an accuracy far higher than using state-of-the-art 3D shape descriptors.
View-GCN: View-Based Graph Convolutional Network for 3D Shape Analysis
  • Xin Wei, Ruixuan Yu, Jian Sun
  • Computer Science
  • 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
  • 2020
A novel view-based graph convolutional network, dubbed view-GCN, that recognizes 3D shapes from a graph representation of multiple views in flexible view configurations. It is a hierarchical network built on local and non-local graph convolution for feature transformation and selective view-sampling for graph coarsening.
3D ShapeNets: A deep representation for volumetric shapes
This work proposes to represent a geometric 3D shape as a probability distribution of binary variables on a 3D voxel grid, using a Convolutional Deep Belief Network, and shows that this 3D deep representation enables significant performance improvement over the state of the art in a variety of tasks.
GVCNN: Group-View Convolutional Neural Networks for 3D Shape Recognition
Experimental results and comparison with state-of-the-art methods show that the proposed GVCNN method can achieve a significant performance gain on both the 3D shape classification and retrieval tasks.
Learning Attentive and Hierarchical Representations for 3D Shape Recognition
Experimental results clearly show that Hyperbolic Embedded Attentive Representation outperforms the state-of-the-art approaches on three 3D shape recognition tasks: generic 3D shape retrieval, 3D shape classification, and sketch-based 3D shape retrieval.
View N-Gram Network for 3D Object Retrieval
Inspired by n-gram models in natural language processing, VNN divides the view sequence into a set of visual n-grams, i.e., overlapping consecutive view sub-sequences, which helps to learn a discriminative global embedding for each 3D object.
Recognizing Objects From Any View With Object and Viewer-Centered Representations
This paper proposes a computational framework by designing object and viewer-centered neural networks (OVCNet) to recognize an object instance viewed from an arbitrary unknown angle and gives rise to a viable and practical computing framework that combines both viewpoint-dependent and viewpoint-independent features for object recognition from any view.
Equivariant Multi-View Networks
A group convolutional approach to multiple view aggregation where convolutions are performed over a discrete subgroup of the rotation group, enabling joint reasoning over all views in an equivariant (instead of invariant) fashion, up to the very last layer.
DeepCCFV: Camera Constraint-Free Multi-View Convolutional Neural Network for 3D Object Retrieval
By reducing the over-fitting issue, a camera constraint-free multi-view convolutional neural network named DeepCCFV is constructed, and the effectiveness of the proposed method in free camera settings, compared with existing state-of-the-art 3D object retrieval methods, is demonstrated.
Dominant Set Clustering and Pooling for Multi-View 3D Object Recognition
A recurrent clustering and pooling module that boosts performance for multi-view 3D object recognition, achieving a new state-of-the-art test-set recognition accuracy of 93.8% on the ModelNet40 database.