Corpus ID: 235458006

Learning to Predict Visual Attributes in the Wild

Khoi Pham, Kushal Kafle, Zhe Lin, Zhi Ding, Scott D. Cohen, Quan Tran, Abhinav Shrivastava
Visual attributes constitute a large portion of the information contained in a scene. Objects can be described using a wide variety of attributes that portray their visual appearance (color, texture), geometry (shape, size, posture), and other intrinsic properties (state, action). Existing work is mostly limited to the study of attribute prediction in specific domains. In this paper, we introduce a large-scale in-the-wild visual attribute prediction dataset consisting of over 927K attribute annotations…


Deep Relative Attributes
This work introduces a deep neural network architecture for relative attribute prediction: a convolutional neural network learns image features, and an additional ranking layer learns to rank images based on those features.
Deep Imbalanced Attribute Classification using Visual Attention Aggregation
This work introduces a loss function that handles class imbalance at both the class and the instance level, and demonstrates that penalizing attention masks with high prediction variance compensates for the weak supervision of the attention mechanism.
Relative attributes
This work proposes a generative model over the joint space of attribute ranking outputs, along with a novel form of zero-shot learning in which the supervisor relates an unseen object category to previously seen objects via attributes (for example, ‘bears are furrier than giraffes’).
Improving Pedestrian Attribute Recognition With Weakly-Supervised Multi-Scale Attribute-Specific Localization
A flexible Attribute Localization Module (ALM) is proposed to adaptively discover the most discriminative regions and learn regional features for each attribute at multiple levels; a feature pyramid architecture is introduced to enhance attribute-specific localization at low levels with high-level semantic guidance.
The iMaterialist Fashion Attribute Dataset
This work contributes to the community a new dataset called iMaterialist Fashion Attribute (iFashion-Attribute), constructed from over one million fashion images with a label space of 8 groups comprising 228 fine-grained attributes in total; it is the first known million-scale multi-label, fine-grained image dataset.
Learning Visual Attributes
It is shown that attributes can be learnt starting from a text query to Google image search, and can then be used to recognize the attribute and determine its spatial extent in novel real-world images.
LVIS: A Dataset for Large Vocabulary Instance Segmentation
This work introduces LVIS (pronounced ‘el-vis’): a new dataset for Large Vocabulary Instance Segmentation, which has a long tail of categories with few training samples due to the Zipfian distribution of categories in natural images.
Task-Aware Attention Model for Clothing Attribute Prediction
The performance of attribute prediction demonstrates the superiority of the proposed task-aware attention mechanism over several state-of-the-art methods in both shop and street domains.
Human Attribute Recognition by Deep Hierarchical Contexts
This work trains a convolutional neural network to select the most attribute-descriptive human parts from all poselet detections and combines them with the whole body into a pose-normalized deep representation, which surpasses competitive baselines on this dataset and other popular ones.
Recovering the Missing Link: Predicting Class-Attribute Associations for Unsupervised Zero-Shot Learning
This work proposes an approach to learn relations that couple class embeddings with their corresponding attributes, given only the name of an unseen class; it outperforms state-of-the-art methods in both predicting class-attribute associations and unsupervised zero-shot learning by a large margin.