A High Performance CRF Model for Clothes Parsing

@inproceedings{SimoSerra2014AHP,
  title={A High Performance CRF Model for Clothes Parsing},
  author={Edgar Simo-Serra and Sanja Fidler and Francesc Moreno-Noguer and Raquel Urtasun},
  booktitle={ACCV},
  year={2014}
}
In this paper we tackle the problem of clothing parsing: our goal is to segment and classify the different garments a person is wearing. We frame the problem as one of inference in a pose-aware Conditional Random Field (CRF) that exploits appearance, figure/ground segmentation, shape and location priors for each garment, as well as similarities between segments and symmetries between different human body parts. We demonstrate the effectiveness of our approach on the Fashionista dataset [1] and…
Looking at Outfit to Parse Clothing
TLDR
This paper extends fully-convolutional neural networks (FCN) for the clothing parsing problem with a side-branch network, referred to as an outfit encoder, that predicts a consistent set of clothing labels to encourage combinatorial preference, and with a conditional random field that explicitly considers coherent label assignment for the given image.
Finer-Net: Cascaded Human Parsing with Hierarchical Granularity
TLDR
A cascaded segmentation network with three stages that outperforms state-of-the-art methods and shows strong generalization ability on human parsing problems.
Clothes Co-Parsing Via Joint Image Segmentation and Labeling With Application to Clothing Retrieval
TLDR
An integrated system for clothing co-parsing (CCP) is proposed that jointly parses a set of clothing images (unsegmented but annotated with tags) into semantic configurations through two phases of inference.
Transferring pose and augmenting background for deep human-image parsing and its applications
TLDR
This paper incorporates a pose estimation network into an end-to-end human-image parsing network, in order to transfer common features across the domains, and increases the variation in image backgrounds automatically by replacing the original backgrounds of human images with others obtained from large-scale scenery image datasets.
Transferring Pose and Augmenting Background Variation for Deep Human Image Parsing
TLDR
To handle various poses, a pose estimation network is incorporated into an end-to-end human parsing network in order to transfer common features across the domains and increase the variations of background images automatically by replacing the original backgrounds of human images with those obtained from large-scale scenery image datasets.
Look into Person: Joint Body Parsing & Pose Estimation Network and a New Benchmark
TLDR
A new benchmark named “Look into Person (LIP)” is introduced that provides a significant advancement in terms of scalability, diversity, and difficulty, which are crucial for future developments in human-centric analysis.
EPYNET: Efficient Pyramidal Network for Clothing Segmentation
TLDR
A new measure of dataset imbalance, motivated by the difficulty in comparing different datasets for clothing segmentation, is introduced and can be potentially useful for many real-world applications related to soft biometrics, people surveillance, image description, clothes recommendation, and others.
Multi-label Fashion Image Classification with Minimal Human Supervision
TLDR
This work presents a new dataset of full body poses, each with a set of 66 binary labels describing the garments worn in the image; the labels are obtained automatically and manually corrected for a small subset of the data.
Fully Convolutional Network with Superpixel Parsing for Fashion Web Image Segmentation
TLDR
A new method for extracting deformable clothing items from still images by extending the output of a Fully Convolutional Neural Network to infer context from local units (superpixels) is introduced.
Transferring clothing parsing from fashion dataset to surveillance
TLDR
This paper addresses the problem of automatic clothing parsing in surveillance video with the information from user-generated tags such as “jeans” and “T-shirt” with a weakly-supervised transfer learning method.

References

Parsing Clothes in Unrestricted Images
TLDR
This work proposes a method to segment clothes in settings where there is no restriction on the number and type of clothes, the pose of the person, the viewing angle, occlusion, or the number of people, and it outperforms a recent attempt on H3D.
Retrieving Similar Styles to Parse Clothing
TLDR
This paper tackles the clothing parsing problem using a retrieval-based approach that combines parsing from: pre-trained global clothing models, local clothing models learned on the fly from retrieved examples, and transferred parse-masks (Paper Doll item transfer) from retrieved examples.
Parsing clothing in fashion photographs
TLDR
An effective method for parsing clothing in fashion photographs, an extremely challenging problem due to the large number of possible garment items, variations in configuration, garment appearance, layering, and occlusion is demonstrated.
Paper Doll Parsing: Retrieving Similar Styles to Parse Clothing Items
TLDR
This paper tackles the clothing parsing problem using a retrieval-based approach that combines parsing from: pre-trained global clothing models, local clothing models learned on the fly from retrieved examples, and transferred parse masks (paper doll item transfer) from retrieved examples.
Describing people: A poselet-based approach to attribute classification
TLDR
This work proposes a method for recognizing attributes, such as the gender, hair style and types of clothes of people under large variation in viewpoint, pose, articulation and occlusion typical of personal photo album images, using a part-based approach based on poselets.
A Deformable Mixture Parsing Model with Parselets
TLDR
The Deformable Mixture Parsing Model (DMPM) thus directly solves the problem of human parsing by searching for the best graph configuration from a pool of Parselet hypotheses without intermediate tasks.
Poselets: Body part detectors trained using 3D human pose annotations
TLDR
A new dataset, H3D, is built of annotations of humans in 2D photographs with 3D joint information, inferred using anthropometric constraints, to address the classic problems of detection, segmentation and pose estimation of people in images with a novel definition of a part, a poselet.
Segmentation using Deformable Spatial Priors with Application to Clothing
TLDR
The method builds on an existing MRF formulation incorporating a prior shape model and colour distributions for the constituent parts and proposes a novel shape model consisting of a deformable spatial prior probability for the part-label at each pixel.
Describing Clothing by Semantic Attributes
TLDR
A fully automated system that is capable of generating a list of nameable attributes for clothes on human body in unconstrained images is proposed, and a novel application of dressing style analysis is introduced that utilizes the semantic attributes produced by the system.
Multi-level inference by relaxed dual decomposition for human pose segmentation
TLDR
This paper constructs an energy function over the pose of the human body and pixel-wise foreground / background segmentation and shows how to optimize this energy in a principled way by relaxed dual decomposition, which proceeds by maximizing a concave lower bound on the energy function.