• Publications
  • Influence
Learning Locally-Adaptive Decision Functions for Person Verification
TLDR
The decision function for verification is proposed to be viewed as a joint model of a distance metric and a locally adaptive thresholding rule, and the inference on the decision function is formulated as a second-order large-margin regularization problem, and an efficient algorithm is provided in its dual from.
Large-scale image classification: Fast feature extraction and SVM training
TLDR
A parallel averaging stochastic gradient descent (ASGD) algorithm for training one-against-all 1000-class SVM classifiers and a Hadoop scheme that performs feature extraction in parallel using hundreds of mappers, which achieves state-of-the-art performance on the ImageNet 1000- class classification.
TGIF: A New Dataset and Benchmark on Animated GIF Description
TLDR
A new dataset with 100K animated GIFs from Tumblr and 120K natural language descriptions obtained via crowdsourcing is collected to develop a testbed for image sequence description systems, and it is shown that models fine-tuned from this dataset can be helpful for automatic movie description.
Spatially Coherent Latent Topic Model for Concurrent Segmentation and Classification of Objects and Scenes
TLDR
Spatial-LTM represents an image containing objects in a hierarchical way by over-segmented image regions of homogeneous appearances and the salient image patches within the regions, enforcing the spatial coherency of the model.
Cross-dataset action detection
TLDR
This paper proposes an adaptive action detection approach which reduces the requirement of training labels and is able to handle the task of cross-dataset action detection with few or no extra training labels, and combines model adaptation and action detection into a Maximum a Posterior (MAP) estimation framework.
Geographical topic discovery and comparison
TLDR
The results confirm the hypothesis that the geographical distributions can help modeling topics, while topics provide important cues to group different geographical regions.
Designing Category-Level Attributes for Discriminative Visual Recognition
TLDR
A novel formulation to automatically design discriminative "category-level attributes", which can be efficiently encoded by a compact category-attribute matrix, which allows to achieve intuitive and critical design criteria (category-separability, learn ability) in a principled way.
Learning from Noisy Labels with Distillation
TLDR
This work proposes a unified distillation framework to use “side” information, including a small clean dataset and label relations in knowledge graph, to “hedge the risk” of learning from noisy labels, and proposes a suite of new benchmark datasets to evaluate this task in Sports, Species and Artifacts domains.
Video2GIF: Automatic Generation of Animated GIFs from Video
TLDR
This work proposes a Robust Deep RankNet that, given a video, generates a ranked list of its segments according to their suitability as GIF, and effectively deals with the noisy web data by proposing a novel adaptive Huber loss in the ranking formulation.
Mining Fashion Outfit Composition Using an End-to-End Deep Learning Approach on Set Data
TLDR
A machine learning system to compose fashion outfits automatically to score fashion outfit candidates based on the appearances and metadata and achieves an AUC of 85% for the scoring component, and an accuracy of 77% for a constrained composition task.
...
...