Learning Co-segmentation by Segment Swapping for Retrieval and Discovery

  title={Learning Co-segmentation by Segment Swapping for Retrieval and Discovery},
  author={Xi Shen and Alexei A. Efros and Armand Joulin and Mathieu Aubry},
  journal={2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)},
  • Xi Shen, Alexei A. Efros, Armand Joulin, Mathieu Aubry
  • Published 29 October 2021
  • Computer Science
  • 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)
The goal of this work is to efficiently identify visually similar patterns in images, e.g. identifying an artwork detail copied between an engraving and an oil painting, or recognizing parts of a night-time photograph visible in its daytime counterpart. Lack of training data is a key challenge for this co-segmentation task. We present a simple yet surprisingly effective approach to overcome this difficulty: we generate synthetic training pairs by selecting segments in an image and copy-pasting…

TokenCut: Segmenting Objects in Images and Videos with Self-supervised Transformer and Normalized Cut

A graph-based algorithm that uses the features obtained by a self-supervised transformer to detect and segment salient objects in images and videos and achieves state-of-the-art results on several common image and video detection and segmentation tasks.

Self-Supervised Transformers for Unsupervised Object Discovery using Normalized Cut

A graph-based method that uses self-supervised transformer features to discover an object in an image via spectral clustering with generalized eigendecomposition. It shows that the second smallest eigenvector provides a cutting solution, since its absolute value indicates the likelihood that a token belongs to a foreground object.
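
The spectral bipartition summarized above can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: the function name `ncut_bipartition`, the similarity threshold `tau`, the small edge weight `eps`, and the split at the eigenvector's mean are all choices made for this example.

```python
import numpy as np

def ncut_bipartition(features, tau=0.2, eps=1e-5):
    """Split tokens into two groups with a normalized-cut style relaxation.

    features: array of shape (n_tokens, dim), one feature vector per token.
    tau, eps: hypothetical threshold and weak-edge weight (assumptions).
    Returns the second smallest generalized eigenvector and a boolean split.
    """
    # Cosine similarity between every pair of tokens.
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = f @ f.T

    # Fully connected graph: strong edges where tokens are similar, weak otherwise.
    W = np.where(sim > tau, 1.0, eps)
    d = W.sum(axis=1)

    # Solve (D - W) y = lambda * D y through the symmetric normalized Laplacian
    # L_sym = I - D^{-1/2} W D^{-1/2}, whose eigenvectors v give y = D^{-1/2} v.
    d_inv_sqrt = 1.0 / np.sqrt(d)
    L_sym = np.eye(len(d)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L_sym)  # eigenvalues returned in ascending order

    # Second smallest eigenvector, mapped back to the generalized problem.
    y = d_inv_sqrt * vecs[:, 1]

    # Assumed splitting rule: threshold at the mean of the eigenvector.
    return y, y > y.mean()

# Toy example: three tokens pointing one way, five pointing another.
toy_features = np.array([
    [1.0, 0.05], [1.0, 0.0], [1.0, -0.05],
    [0.0, 1.0], [0.05, 1.0], [-0.05, 1.0], [0.1, 1.0], [0.0, 1.0],
])
toy_vector, toy_split = ncut_bipartition(toy_features)
```

The sign of an eigenvector is arbitrary, so which group is marked `True` carries no meaning by itself; deciding which side is the foreground needs an extra rule that this sketch leaves out.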



Deep Object Co-Segmentation

This work presents a deep object co-segmentation (DOCS) approach for segmenting common objects of the same class within a pair of images that learns to ignore common, or uncommon, background stuff and focuses on objects.

Discovering Visual Patterns in Art Collections With Spatially-Consistent Feature Learning

The key technical insight is to adapt a standard deep feature to this task by fine-tuning it on the specific art collection using self-supervised learning, with spatial consistency between neighbouring feature matches used as the supervisory fine-tuning signal.

COTR: Correspondence Transformer for Matching Across Images

A novel framework for finding correspondences in images based on a deep neural network that, given two images and a query point in one of them, finds its correspondence in the other, yielding a multiscale pipeline able to provide highly-accurate correspondences.

NetVLAD: CNN Architecture for Weakly Supervised Place Recognition

A convolutional neural network architecture that is trainable in an end-to-end manner directly for the place recognition task and an efficient training procedure which can be applied on very large-scale weakly labelled tasks are developed.

Toward unsupervised, multi-object discovery in large-scale image collections

A novel saliency-based region proposal algorithm is proposed that achieves significantly higher overlap with ground-truth objects than other competitive methods and exploits the inherent hierarchical structure of proposals as an effective regularizer for the approach to object discovery.

Localizing Objects with Self-Supervised Transformers and no Labels

This work proposes a simple approach to object discovery that leverages the activation features of a vision transformer pre-trained in a self-supervised manner and outperforms state-of-the-art object discovery methods by up to 8 CorLoc points on PASCAL VOC 2012.
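
For readers unfamiliar with the CorLoc metric mentioned above, here is a small sketch of how it is commonly computed: an image counts as correctly localized when the predicted box overlaps some ground-truth box with an intersection-over-union of at least 0.5. The function names, the `(x1, y1, x2, y2)` box format, and the one-prediction-per-image setup are assumptions made for this illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap, clamped at zero when boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0

def corloc(predicted_boxes, ground_truth_boxes, threshold=0.5):
    """Fraction of images whose single predicted box matches any ground-truth box.

    predicted_boxes: one box per image.
    ground_truth_boxes: one list of boxes per image.
    """
    hits = sum(
        any(iou(pred, gt) >= threshold for gt in gts)
        for pred, gts in zip(predicted_boxes, ground_truth_boxes)
    )
    return hits / len(predicted_boxes)

# Two images: the first prediction is exact, the second barely overlaps.
example_score = corloc(
    [(0, 0, 10, 10), (0, 0, 10, 10)],
    [[(0, 0, 10, 10)], [(5, 5, 15, 15)]],
)
```

Under this reading, an improvement of 8 CorLoc points means the fraction of correctly localized images rises by 0.08.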

Show, Match and Segment: Joint Weakly Supervised Learning of Semantic Matching and Object Co-Segmentation

The model is end-to-end trainable and does not require supervision from manually annotated correspondences and object masks, and performs favorably against the state-of-the-art methods on both semantic matching and object co-segmentation tasks.

Simple Copy-Paste is a Strong Data Augmentation Method for Instance Segmentation

A systematic study of the Copy-Paste augmentation for instance segmentation where the authors randomly paste objects onto an image finds that the simple mechanism of pasting objects randomly is good enough and can provide solid gains on top of strong baselines.
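
The basic mechanism of pasting a masked object at a random location can be sketched in a few lines of NumPy. This is only an illustration of the idea under stated assumptions, not the procedure from the paper: the function names are hypothetical, the source crop is assumed to fit inside the destination image, and scale jittering and blending are left out.

```python
import numpy as np

def paste_object(src_img, src_mask, dst_img, top, left):
    """Copy the masked pixels of src_img into a copy of dst_img at (top, left).

    src_img: (h, w, c) crop containing the object.
    src_mask: (h, w) boolean mask of the object inside the crop.
    dst_img: (H, W, c) destination image with H >= top + h and W >= left + w.
    Returns the composited image and the object's mask in destination coordinates.
    """
    h, w = src_mask.shape
    out = dst_img.copy()
    out_mask = np.zeros(dst_img.shape[:2], dtype=bool)
    # `region` is a view into `out`, so writing to it edits the composite.
    region = out[top:top + h, left:left + w]
    region[src_mask] = src_img[src_mask]
    out_mask[top:top + h, left:left + w] = src_mask
    return out, out_mask

def random_paste(src_img, src_mask, dst_img, rng):
    """Paste the object at a uniformly random position that keeps it in bounds."""
    h, w = src_mask.shape
    H, W = dst_img.shape[:2]
    top = int(rng.integers(0, H - h + 1))
    left = int(rng.integers(0, W - w + 1))
    return paste_object(src_img, src_mask, dst_img, top, left)

# Toy example: a 3-pixel object pasted into a blank 4x4 image.
obj = np.full((2, 2, 3), 9, dtype=np.uint8)
obj_mask = np.array([[True, False], [True, True]])
canvas = np.zeros((4, 4, 3), dtype=np.uint8)
fixed_img, fixed_mask = paste_object(obj, obj_mask, canvas, top=1, left=1)
rand_img, rand_mask = random_paste(obj, obj_mask, canvas, np.random.default_rng(0))
```

Returning the pasted mask together with the image means the composite comes with its segmentation label at no extra annotation cost.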

Convolutional Neural Network Architecture for Geometric Matching

This work proposes a convolutional neural network architecture for geometric matching based on three main components that mimic the standard steps of feature extraction, matching and simultaneous inlier detection and model parameter estimation, while being trainable end-to-end.

RANSAC-Flow: generic two-stage image alignment

This paper considers the generic problem of dense alignment between two images and proposes a two-stage process: first, a feature-based parametric coarse alignment using one or more homographies, followed by non-parametric fine pixel-wise alignment.