Learning Similarity Conditions Without Explicit Supervision

@inproceedings{Tan2019LearningSC,
  title={Learning Similarity Conditions Without Explicit Supervision},
  author={Reuben Tan and Mariya I. Vasileva and Kate Saenko and Bryan A. Plummer},
  booktitle={2019 IEEE/CVF International Conference on Computer Vision (ICCV)},
  year={2019},
  pages={10372--10381}
}
Many real-world tasks require models to compare images along multiple similarity conditions (e.g., similarity in color, category, or shape). Existing methods often reason about these complex similarity relationships by learning condition-aware embeddings. While such embeddings aid models in learning different notions of similarity, they also limit their capability to generalize to unseen categories since they require explicit labels at test time. To address this deficiency, we propose an approach…
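The condition-aware embeddings the abstract refers to can be illustrated with a minimal NumPy sketch. This is not the authors' implementation; it assumes a CSN-style setup in which per-condition masks project a general embedding into condition-specific subspaces, and contrasts the explicitly supervised case (a known condition label selects a mask) with a weakly supervised case (soft condition weights, which the paper's approach would predict from data, blend the masks). All names and shapes here are illustrative.

```python
import numpy as np

def masked_embedding(x, mask):
    # Project a general embedding into a condition-specific subspace
    # by element-wise masking (as in Conditional Similarity Networks).
    return x * mask

rng = np.random.default_rng(0)
dim, n_conditions = 64, 4

# Learnable per-condition masks (random here; in practice trained
# jointly with the embedding under a triplet loss).
masks = rng.random((n_conditions, dim))

x_anchor = rng.standard_normal(dim)
x_other = rng.standard_normal(dim)

# Explicitly supervised variant: a known condition label (e.g. "color",
# index 0) selects which mask to apply before comparing the images.
d_color = np.linalg.norm(
    masked_embedding(x_anchor, masks[0]) - masked_embedding(x_other, masks[0])
)

# Weakly supervised variant: soft condition weights (here random logits;
# the paper would predict them rather than require a label at test time)
# blend the masks into a single condition-aware projection.
logits = rng.standard_normal(n_conditions)
weights = np.exp(logits) / np.exp(logits).sum()  # softmax over conditions
soft_mask = weights @ masks                      # (dim,) weighted mask
d_soft = np.linalg.norm(
    masked_embedding(x_anchor, soft_mask) - masked_embedding(x_other, soft_mask)
)
```

Because the soft weights are produced by the model rather than supplied as labels, the weakly supervised variant can be applied to categories whose similarity conditions were never annotated.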
Semi-Supervised Visual Representation Learning for Fashion Compatibility
TLDR
This work proposes a semi-supervised learning approach where a large unlabeled fashion corpus is leveraged to create pseudo positive and negative outfits on the fly during training, implicitly incorporating colour and other important attributes through self-supervision.
Flexible Few-Shot Learning with Contextual Similarity
TLDR
This work proposes to build upon recent contrastive unsupervised learning techniques and use a combination of instance and class invariance learning, aiming to obtain general and flexible features, and finds that this approach performs strongly on new flexible few-shot learning benchmarks.
Improving Deep Metric Learning by Divide and Conquer
TLDR
This work significantly improves upon the state-of-the-art in image retrieval and clustering on CUB200-2011, CARS196, SOP, In-shop Clothes, and VehicleID datasets by jointly splitting the embedding space and the data hierarchically into smaller sub-parts.
Effectively Leveraging Attributes for Visual Similarity
TLDR
The Pairwise Attribute-informed similarity Network (PAN) breaks similarity learning into capturing similarity conditions and relevance scores from a joint representation of two images, enabling the model to identify that two images contain the same attribute yet deem it irrelevant and ignore it when measuring the similarity between the two images.
Target-Oriented Deformation of Visual-Semantic Embedding Space
TLDR
Qualitative analysis reveals that TOD-Net successfully emphasizes entity-specific concepts and retrieves more diverse targets by handling higher levels of diversity than existing models, achieving state-of-the-art cross-modal retrieval on the MSCOCO dataset.
Why do These Match? Explaining the Behavior of Image Similarity Models
TLDR
Salient Attributes for Network Explanation is introduced to explain image similarity models, where a model's output is a score measuring the similarity of two inputs rather than a classification score; the method can also improve performance on the classic task of attribute recognition.
Fashion Compatibility Recommendation via Unsupervised Metric Graph Learning
In the task of fashion compatibility prediction, the goal is to pick an item from a candidate list to complement a partial outfit in the most appealing manner. Existing fashion compatibility…
Contextualizing Multiple Tasks via Learning to Decompose
TLDR
This work proposes a general approach, Learning to Decompose Network (LEADNET), for both cases, which contextualizes a model by meta-learning multiple maps for concept discovery; the representations of instances are decomposed and adapted conditioned on the contexts.
Kaleido-BERT: Vision-Language Pre-training on Fashion Domain
TLDR
A new vision-language (VL) pre-training model dubbed Kaleido-BERT is presented, which introduces a novel kaleido strategy for fashion cross-modality representations from transformers and designs alignment-guided masking to jointly focus more on image-text semantic relations.
Learning Color Compatibility in Fashion Outfits
TLDR
A novel way to model outfit compatibility and an innovative learning scheme are presented, combining a novel graph construction that better utilizes the power of graph neural networks and setting the new state of the art in fashion compatibility prediction.

References

SHOWING 1-10 OF 45 REFERENCES
Conditional Similarity Networks
TLDR
This work proposes Conditional Similarity Networks (CSNs) that learn embeddings differentiated into semantically distinct subspaces that capture the different notions of similarities.
Learning Type-Aware Embeddings for Fashion Compatibility
TLDR
This paper presents an approach to learning an image embedding that respects item type, and jointly learns notions of item similarity and compatibility in an end-to-end model.
Give Me a Hint! Navigating Image Databases Using Human-in-the-Loop Feedback
TLDR
An attribute-based interactive image search that can leverage human-in-the-loop feedback to iteratively refine image search results is introduced, and the recently introduced Conditional Similarity Network is extended to incorporate global similarity in training visual embeddings, which results in more natural transitions as the user explores the learned similarity embeddings.
Context-Aware Visual Compatibility Prediction
TLDR
This work proposes a method that predicts compatibility between two items based on their visual features, as well as their context, using a graph neural network that learns to generate product embeddings conditioned on their context.
Fine-Grained Visual Comparisons with Local Learning
  • A. Yu, K. Grauman
  • Computer Science
  • 2014 IEEE Conference on Computer Vision and Pattern Recognition
  • 2014
TLDR
This work proposes a local learning approach for fine-grained visual comparisons that outperforms state-of-the-art methods for relative attribute prediction and shows how to identify analogous pairs using learned metrics.
Semantic Jitter: Dense Supervision for Visual Comparisons via Synthetic Images
  • A. Yu, K. Grauman
  • Computer Science
  • 2017 IEEE International Conference on Computer Vision (ICCV)
  • 2017
TLDR
This work proposes to overcome the problem of sparse supervision via synthetically generated images, bootstrapping imperfect image generators to counteract sample sparsity in learning to rank.
Conditional Image-Text Embedding Networks
TLDR
This paper proposes a concept weight branch that automatically assigns phrases to embeddings, whereas prior works predefine such assignments; this simplifies the representation requirements for individual embeddings and allows underrepresented concepts to take advantage of the shared representations before feeding them into concept-specific layers.
Learning Visual Clothing Style with Heterogeneous Dyadic Co-Occurrences
With the rapid proliferation of smart mobile devices, users now take millions of photos every day. These include large numbers of clothing and accessory images. We would like to answer questions like…
Cross-Domain Image Retrieval with a Dual Attribute-Aware Ranking Network
TLDR
This work proposes a Dual Attribute-aware Ranking Network (DARN) for retrieval feature learning, consisting of two sub-networks, one for each domain, whose retrieval feature representations are driven by semantic attribute learning.
Learning Two-Branch Neural Networks for Image-Text Matching Tasks
TLDR
This paper investigates two-branch neural networks for learning the similarity between image-sentence matching and region-phrase matching, and proposes two network structures that produce different output representations.