Corpus ID: 235417600

Object Segmentation Without Labels with Large-Scale Generative Models

Andrey Voynov, Stanislav Morozov, Artem Babenko

The recent rise of unsupervised and self-supervised learning has dramatically reduced the dependency on labeled data, providing effective image representations for transfer to downstream vision tasks. Furthermore, recent works employed these representations in a fully unsupervised setup for image classification, reducing the need for human labels on the fine-tuning stage as well. This work demonstrates that large-scale unsupervised models can also perform a more challenging object segmentation… 

Deep Spectral Methods: A Surprisingly Strong Baseline for Unsupervised Semantic Segmentation and Localization

Experiments on complex datasets demonstrate that the simple spectral method outperforms the state-of-the-art in unsupervised localization and segmentation by a significant margin, and can be readily used for a variety of complex image editing tasks, such as background removal and compositing.

TokenCut: Segmenting Objects in Images and Videos with Self-supervised Transformer and Normalized Cut

A graph-based algorithm that uses the features obtained by a self-supervised transformer to detect and segment salient objects in images and videos and achieves state-of-the-art results on several common image and video detection and segmentation tasks.

Refine and Represent: Region-to-Object Representation Learning

After pretraining on ImageNet, R2O pretrained models are able to surpass existing state-of-the-art in unsupervised object segmentation on the Caltech-UCSD Birds 200-2011 dataset without any further training.

FurryGAN: High Quality Foreground-aware Image Synthesis

FurryGAN produces realistic images with remarkably detailed alpha masks which cover hair, fur, and whiskers in a fully unsupervised manner.

EAA-Net: Rethinking the Autoencoder Architecture with Intra-class Features for Medical Image Segmentation

This paper proposes a light-weight end-to-end segmentation framework based on multi-task learning, termed Edge Attention autoencoder Network (EAA-Net), to improve edge segmentation ability.

Interpreting Latent Spaces of Generative Models for Medical Images using Unsupervised Methods

The results show that unsupervised methods to discover interpretable directions in GANs generalize to VAEs and can be applied to medical images, which opens a wide array of future work using these methods in medical image analysis.

Multi-Source Uncertainty Mining for Deep Unsupervised Saliency Detection

This paper proposes an Uncertainty Mining Network (UMNet), built from multiple Merge-and-Split modules that recursively analyze the commonality and difference among multiple noisy labels and infer a pixel-wise uncertainty map for each label, so that reliable labels can be adaptively selected for SOD network learning.

Guess What Moves: Unsupervised Video and Image Segmentation by Anticipating Motion

This work proposes to supervise an image segmentation network, tasking it with predicting regions that are likely to contain simple motion patterns, and thus likely to correspond to objects, and applies this network in two modes.

Unsupervised Salient Object Detection with Spectral Cluster Voting

This paper proposes a simple but effective winner-takes-all voting mechanism for selecting the salient masks, leveraging object priors based on framing and distinctiveness and trains a salient object detector, termed SELF-MASK, which outperforms prior approaches on three unsupervised SOD benchmarks.

Self-Supervised Transformers for Unsupervised Object Discovery using Normalized Cut

A graph-based method that uses self-supervised transformer features to discover an object in an image via spectral clustering with generalized eigen-decomposition, showing that the second smallest eigenvector provides a cutting solution, since its absolute value indicates the likelihood that a token belongs to a foreground object.



Unsupervised Discovery of Interpretable Directions in the GAN Latent Space

This paper introduces an unsupervised method to identify interpretable directions in the latent space of a pretrained GAN model by a simple model-agnostic procedure, and finds directions corresponding to sensible semantic manipulations without any form of (self-)supervision.

Unsupervised Object Segmentation by Redrawing

ReDO is presented, a new model able to extract objects from images without any annotation in an unsupervised way based on the idea that it should be possible to change the textures or colors of the objects without changing the overall distribution of the dataset.

Learning to Detect Salient Objects with Image-Level Supervision

This paper develops a weakly supervised learning method for saliency detection using image-level tags only, which outperforms unsupervised methods by a large margin and achieves performance comparable or even superior to that of fully supervised counterparts.

Saliency Detection via Graph-Based Manifold Ranking

This work considers both foreground and background cues in a different way and ranks the similarity of the image elements to foreground or background cues via graph-based manifold ranking, defined by their relevance to the given seeds or queries.

OneGAN: Simultaneous Unsupervised Learning of Conditional Image Generation, Foreground Segmentation, and Fine-Grained Clustering

We present a method for simultaneously learning, in an unsupervised manner, (i) a conditional image generator, (ii) foreground extraction and segmentation, (iii) clustering into a two-level class

DeepUSPS: Deep Robust Unsupervised Saliency Prediction With Self-Supervision

This work proposes a two-stage mechanism for robust unsupervised object saliency prediction, where the first stage involves refinement of the noisy pseudo labels generated from different handcrafted methods, and shows that this self-learning procedure outperforms all the existing unsupervised methods over different datasets.

Emergence of Object Segmentation in Perturbed Generative Models

We introduce a novel framework to build a model that can learn how to segment objects from a collection of images without any human annotation. Our method builds on the observation that the location

Large Scale GAN Training for High Fidelity Natural Image Synthesis

It is found that applying orthogonal regularization to the generator renders it amenable to a simple "truncation trick," allowing fine control over the trade-off between sample fidelity and variety by reducing the variance of the Generator's input.

Hierarchical Image Saliency Detection on Extended CSSD

This work proposes a multi-layer approach and constructs an extended Complex Scene Saliency Dataset (ECSSD) to include complex but general natural images and improves detection quality on many images that cannot be handled well traditionally.

Few-Cost Salient Object Detection with Adversarial-Paced Learning

This paper proposes to learn an effective salient object detection model from manual annotations on only a few training images, dramatically alleviating the human labor of training such models, and introduces an adversarial-paced learning (APL)-based framework for this few-cost learning scenario.