Two-Stream Multi-Task Network for Fashion Recognition

  • Peizhao Li, Yanjing Li, Xiaolong Jiang, Xiantong Zhen
  • 2019 IEEE International Conference on Image Processing (ICIP)
In this paper, we present a two-stream multi-task network for fashion recognition. [...] We design two knowledge-sharing strategies that enable information transfer between tasks and improve overall performance. The proposed model achieves state-of-the-art results on a large-scale fashion dataset compared with existing methods, demonstrating its effectiveness for fashion recognition.
An improved landmark-driven and spatial–channel attentive convolutional neural network for fashion clothes classification
Experimental results show that the proposed deep neural network architecture outperforms other recently reported state-of-the-art techniques for classifying fashion clothes.
A Brief Review of Recent Progress in Fashion Landmark Detection
  • Yungang Zhang, Cai Zhang, F. Du
  • Computer Science
  • 2019 12th International Congress on Image and Signal Processing, BioMedical Engineering and Informatics (CISP-BMEI)
  • 2019
This paper roughly categorizes recently proposed fashion landmark detection methods and discusses how they improve detection performance, along with the fashion benchmark datasets used for landmark detection.
Leveraging Class Hierarchy in Fashion Classification
Experimental results on large fashion datasets show that taking hierarchical dependencies between class labels into account can help improve performance.
Multiple-Clothing Detection and Fashion Landmark Estimation Using a Single-Stage Detector
A one-stage detector that rapidly detects multiple clothing items and landmarks in fashion images; its low parameter count and low computational cost make it efficient and suitable for low-power devices.
Condition-CNN: A hierarchical multi-label fashion image classification model
A novel hierarchical image classification model, Condition-CNN, is proposed, which addresses some of the shortcomings of the branching convolutional neural network in terms of training time and fine-grained accuracy.
Powering Virtual Try-On via Auxiliary Human Segmentation Learning
This work uses auxiliary learning to power an existing state-of-the-art virtual try-on network, with prediction of human semantic segmentation as the auxiliary task, and shows that this lets the network better model the boundaries of the clothing item and human skin, thereby producing a better fit.
Clothing Classification using Unsupervised Pre-Training
Experimental results show that unsupervised pre-training attains classification accuracy comparable to fully supervised models while using five times less labelled data during the fine-tuning phase.
From Street Photos to Fashion Trends: Leveraging User-Provided Noisy Labels for Fashion Understanding
This work proposes the Fashion Attributes Recognition Network (FARNet) based on the multi-task learning framework to improve fashion recognition, and shows that this approach significantly outperforms existing methods.
Multimodal Sequential Fashion Attribute Prediction
This work proposes a sequential prediction model that can learn to capture the dependencies between the different attribute values in the chain of product attributes, and shows that the sequential model outperforms two non-sequential baselines on all experimental datasets.


Two-Stream Convolutional Networks for Action Recognition in Videos
This work proposes a two-stream ConvNet architecture which incorporates spatial and temporal networks and demonstrates that a ConvNet trained on multi-frame dense optical flow is able to achieve very good performance in spite of limited training data.
Stacked Hourglass Networks for Human Pose Estimation
This work introduces a novel convolutional network architecture for human pose estimation, described as a “stacked hourglass” network because of the successive pooling and upsampling steps used to produce a final set of predictions.
Heterogeneous Multi-task Learning for Human Pose Estimation with Deep Convolutional Neural Network
A heterogeneous multi-task learning framework for human pose estimation from monocular images using a deep convolutional neural network is proposed, and including the detection tasks is shown to help regularize the network, directing it to converge to a good solution.
Attentive Fashion Grammar Network for Fashion Landmark Detection and Clothing Category Classification
A knowledge-guided fashion network is proposed for visual fashion analysis, e.g., fashion landmark localization and clothing category classification, and Bidirectional Convolutional Recurrent Neural Networks (BCRNNs) are introduced for efficient message passing over grammar topologies, producing regularized landmark layouts.
Convolutional Two-Stream Network Fusion for Video Action Recognition
A new ConvNet architecture for spatiotemporal fusion of video snippets is proposed and evaluated on standard benchmarks, where it achieves state-of-the-art results.
Multi-Task CNN Model for Attribute Prediction
A joint multi-task learning algorithm is proposed to better predict attributes in images using deep convolutional neural networks (CNNs), together with a method to decompose the overall model's parameters into a latent task matrix and a combination matrix.
Fashion Landmark Detection in the Wild
Fashion landmarks are compared with clothing bounding boxes and human joints in two applications, fashion attribute prediction and clothes retrieval, showing that fashion landmarks are a more discriminative representation for understanding fashion images.
Unconstrained Fashion Landmark Detection via Hierarchical Recurrent Transformer Networks
This work presents a novel Deep LAndmark Network (DLAN), where bounding boxes and landmarks are jointly estimated and trained iteratively in an end-to-end manner, and presents a large-scale fashion landmark dataset, namely Unconstrained Landmark Database (ULD), consisting of 30K images.
Leveraging Weakly Annotated Data for Fashion Image Retrieval and Label Prediction
This paper presents a weakly supervised method for learning a visual representation adapted to e-commerce products, which achieves nearly state-of-the-art results on the DeepFashion In-Shop Clothes Retrieval and Categories Attributes Prediction tasks.
DeepFashion: Powering Robust Clothes Recognition and Retrieval with Rich Annotations
This work introduces DeepFashion, a large-scale clothes dataset with comprehensive annotations, and proposes a new deep model, namely FashionNet, which learns clothing features by jointly predicting clothing attributes and landmarks.