Semi-Supervised Learning for Few-Shot Image-to-Image Translation

  title={Semi-Supervised Learning for Few-Shot Image-to-Image Translation},
  author={Yaxing Wang and Salman Hameed Khan and Abel Gonzalez-Garcia and Joost van de Weijer and Fahad Shahbaz Khan},
  journal={2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  • Published 30 March 2020
In the last few years, unpaired image-to-image translation has witnessed remarkable progress. Although the latest methods are able to generate realistic images, they crucially rely on a large number of labeled images. Recently, some methods have tackled the challenging setting of few-shot image-to-image translation, reducing the labeled data requirements for the target domain during inference. In this work, we go one step further and reduce the amount of required labeled data also from the… 

Rethinking the Truly Unsupervised Image-to-Image Translation

A truly unsupervised image-to-image translation model (TUNIT) that simultaneously learns to separate image domains and translates input images into the estimated domains and is robust against the choice of hyperparameters is proposed.

ManiFest: Manifold Deformation for Few-shot Image Translation

ManiFest is a framework for few-shot image translation that learns a context-aware representation of a target domain from a few images only, and can alternatively be conditioned on a single exemplar image to reproduce its specific style.

LANIT: Language-Driven Image-to-Image Translation for Unlabeled Data

A LANguage-driven Image-to-image Translation model, dubbed LANIT, that achieves comparable or superior performance to existing models and introduces a slack domain to cover samples that are not covered by the candidate domains.

Leveraging Local Domains for Image-to-Image Translation

This paper leverages human knowledge about spatial domain characteristics, which it refers to as 'local domains', and demonstrates its benefit for image-to-image translation, showing that all tested proxy tasks are significantly improved without ever seeing the target domain at training.

DeepI2I: Enabling Deep Hierarchical Image-to-Image Translation by Transferring from GANs

A novel deep hierarchical Image-to-Image Translation method, called DeepI2I, that transfers knowledge from pre-trained GANs and qualitatively and quantitatively demonstrates that transfer learning significantly improves the performance of I2I systems, especially for small datasets.

Attribute Group Editing for Reliable Few-shot Image Generation

A new “editing-based” method, i.e., Attribute Group Editing (AGE), for few-shot image generation, capable of not only producing more realistic and diverse images for downstream visual applications with limited data but achieving controllable image editing with interpretable category-irrelevant directions.

Few-shot Semantic Image Synthesis Using StyleGAN Prior

This paper presents a training strategy that performs pseudo labeling of semantic masks using the StyleGAN prior that can synthesize high-quality images from not only dense semantic masks but also sparse inputs such as landmarks and scribbles.

Few-Shot Model Adaptation for Customized Facial Landmark Detection, Segmentation, Stylization and Shadow Removal

The FSMA framework is prominent in its versatility across a wide range of facial image applications; it achieves state-of-the-art few-shot landmark detection performance and offers, for the first time, satisfying solutions for few-shot face segmentation, stylization, and facial shadow removal tasks.

Local Propagation for Few-Shot Learning

This work treats local image features as independent examples, builds a graph on them and uses it to propagate both the features themselves and the labels, known and unknown, improving accuracy over corresponding methods.

Multi-Style Unsupervised Image Synthesis Using Generative Adversarial Nets

A novel Multi-Style Unsupervised Feature-Wise image synthesis model using Generative Adversarial Nets (MSU-FW-GAN) based on the MSU-GAN is proposed for the shape variation tasks.

Few-Shot Unsupervised Image-to-Image Translation

This model achieves this few-shot generation capability by coupling an adversarial training scheme with a novel network design, and verifies the effectiveness of the proposed framework through extensive experimental validation and comparisons to several baseline methods on benchmark datasets.

DualGAN: Unsupervised Dual Learning for Image-to-Image Translation

A novel dual-GAN mechanism is developed, which enables image translators to be trained from two sets of unlabeled images from two domains, and can even achieve comparable or slightly better results than conditional GAN trained on fully labeled data.

SMIT: Stochastic Multi-Label Image-to-Image Translation

This work proposes a joint framework of diversity and multi-mapping image-to-image translations, using a single generator to conditionally produce countless and unique fake images that hold the underlying characteristics of the source image.

Latent Filter Scaling for Multimodal Unsupervised Image-To-Image Translation

This work presents a simple method that produces higher quality images than current state-of-the-art while maintaining the same amount of multimodal diversity by treating the latent code as a modifier of the convolutional filters.

Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial Networks

This work presents an approach for learning to translate an image from a source domain X to a target domain Y in the absence of paired examples, and introduces a cycle consistency loss to push F(G(X)) ≈ X (and vice versa).

TransGaGa: Geometry-Aware Unsupervised Image-To-Image Translation

A novel disentangle-and-translate framework to tackle the complex objects image-to-image translation task, which disentangles image space into a Cartesian product of the appearance and the geometry latent spaces and supports multimodal translation.

Multimodal Unsupervised Image-to-Image Translation

A Multimodal Unsupervised Image-to-image Translation (MUNIT) framework that assumes that the image representation can be decomposed into a content code that is domain-invariant, and a style code that captures domain-specific properties.
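The content/style factorization that MUNIT assumes can be illustrated with a minimal hypothetical sketch: here "encoding" is just splitting a vector into a content part and a style part, whereas in MUNIT the encoders and decoder are learned networks. The point of the sketch is the mechanics of translation by style swapping, not the model itself.

```python
import numpy as np

def encode(img, content_dim):
    """Split a flat representation into (content code, style code)."""
    return img[:content_dim], img[content_dim:]

def decode(content, style):
    """Recombine a content code with a style code into one representation."""
    return np.concatenate([content, style])

rng = np.random.default_rng(0)
img_a = rng.normal(size=8)  # representation of an image from domain A
img_b = rng.normal(size=8)  # representation of an image from domain B

c_a, s_a = encode(img_a, content_dim=6)
c_b, s_b = encode(img_b, content_dim=6)

# Translate A -> B: keep A's domain-invariant content, adopt B's style code.
a_in_style_b = decode(c_a, s_b)

# Sampling different style codes from B yields multimodal outputs for one input.
another_style = rng.normal(size=2)
a_in_other_style = decode(c_a, another_style)
```

Reconstructing with an image's own style code recovers the original representation, which is the auto-encoding constraint MUNIT trains with alongside its adversarial losses.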

One-Shot Unsupervised Cross Domain Translation

This work argues that this task could be a key AI capability that underlines the ability of cognitive agents to act in the world and presents empirical evidence that the existing unsupervised domain translation methods fail on this task.

Toward Multimodal Image-to-Image Translation

This work aims to model a distribution of possible outputs in a conditional generative modeling setting that helps prevent a many-to-one mapping from the latent code to the output during training, also known as the problem of mode collapse.

Image-To-Image Translation via Group-Wise Deep Whitening-And-Coloring Transformation

An end-to-end approach tailored for image translation that efficiently approximates this transformation with the novel regularization methods is proposed and is fast, both in training and inference, and highly effective in reflecting the style of an exemplar.