Corpus ID: 233423159

Contrastive Spatial Reasoning on Multi-View Line Drawings

@article{Xiang2021ContrastiveSR,
  title={Contrastive Spatial Reasoning on Multi-View Line Drawings},
  author={Siyuan Xiang and Anbang Yang and Yanfei Xue and Yaoqing Yang and Chen Feng},
  journal={ArXiv},
  year={2021},
  volume={abs/2104.13433}
}
Spatial reasoning on multi-view line drawings by state-of-the-art supervised deep networks was recently shown to yield puzzlingly low performance on the SPARE3D dataset. To study the reason behind the low performance and to further our understanding of these tasks, we design controlled experiments on both input data and network designs. Guided by hindsight from these experimental results, we propose a simple contrastive learning approach along with other network modifications to improve the…
