• Publications
Cross-domain video concept detection using adaptive SVMs
TLDR
This paper proposes Adaptive Support Vector Machines (A-SVMs), a general method for adapting one or more existing classifiers of any type to a new dataset; A-SVMs outperform several baseline and competing methods in classification accuracy and efficiency on cross-domain concept detection in the TRECVID corpus.
Person Re-identification: Past, Present and Future
TLDR
The history of person re-identification and its relationship with image classification and instance retrieval are introduced, and two new re-ID tasks that are much closer to real-world applications are described and discussed.
Large-scale concept ontology for multimedia
TLDR
The large-scale concept ontology for multimedia (LSCOM) is the first of its kind designed to simultaneously optimize utility to facilitate end-user access, cover a large semantic space, make automated extraction feasible, and increase observability in diverse broadcast news video data sets.
Practical elimination of near-duplicates from web video search
TLDR
The results of 24 queries on a data set of 12,790 videos retrieved from Google, Yahoo! and YouTube show that this hierarchical approach can dramatically reduce the redundant videos displayed to the user in the top result set, at relatively small computational cost.
Infrared Patch-Image Model for Small Target Detection in a Single Image
TLDR
Extensive synthetic and real data experiments show that the proposed small target detection method not only works more stably for different target sizes and signal-to-clutter ratio values, but also has better detection performance compared with conventional baseline methods.
A discriminative CNN video representation for event detection
TLDR
This paper proposes using a set of latent concept descriptors as the frame descriptor, which enriches visual information while keeping it computationally affordable, resulting in new state-of-the-art performance in event detection on the largest video datasets.
Evaluating bag-of-visual-words representations in scene classification
TLDR
This study provides an empirical basis for designing visual-word representations that are likely to produce superior classification performance and applies techniques used in text categorization to generate image representations that differ in the dimension, selection, and weighting of visual words.
MoSIFT: Recognizing Human Actions in Surveillance Videos
TLDR
This paper proposes an algorithm called MoSIFT, which detects interest points and not only encodes their local appearance but also explicitly models local motion, and introduces a bigram model that constructs correlations between local features to capture the more global structure of actions.
Contrastive Adaptation Network for Unsupervised Domain Adaptation
TLDR
This paper proposes the Contrastive Adaptation Network (CAN), which optimizes a new metric that explicitly models the intra-class domain discrepancy and the inter-class domain discrepancy, and designs an alternating update strategy for training CAN in an end-to-end manner.
Self-Paced Learning with Diversity
TLDR
This work proposes an approach called self-paced learning with diversity (SPLD), which formalizes the preference for both easy and diverse samples into a general regularization term that is independent of the learning objective and thus can be easily generalized to various learning tasks.