Looking for the Devil in the Details: Learning Trilinear Attention Sampling Network for Fine-Grained Image Recognition
TLDR
TASN consists of a trilinear attention module, which generates attention maps by modeling the inter-channel relationships, an attention-based sampler which highlights attended parts with high resolution, and a feature distiller, which distills part features into an object-level feature by weight sharing and feature preserving strategies.
Adaptive Transfer Network for Cross-Domain Person Re-Identification
TLDR
A novel adaptive transfer network (ATNet) for cross-domain person re-identification that decomposes the complicated cross-domain transfer into a set of factor-wise sub-transfers, giving ATNet the capability of precise style transfer at the factor level and, eventually, effective transfer across domains.
Joint multi-label multi-instance learning for image classification
TLDR
This work proposes an integrated multi-label multi-instance learning (MLMIL) approach based on hidden conditional random fields (HCRFs), which simultaneously captures both the connections between semantic labels and regions, and the correlations among the labels in a single formulation.
Mining Travel Patterns from Geotagged Photos
TLDR
This study aims to leverage the wealth of these enriched online photos to analyze people’s travel patterns at the local level of a tour destination by building a statistically reliable database of travel paths from a noisy pool of community-contributed geotagged photos on the Internet.
Abstract Reasoning with Distracting Features
TLDR
This paper proposes a feature robust abstract reasoning (FRAR) model, which consists of a reinforcement-learning-based teacher network that determines the sequence of training and a student network for predictions, and which is able to beat the state-of-the-art models.
Object Relational Graph With Teacher-Recommended Learning for Video Captioning
TLDR
This paper proposes an object relational graph (ORG) based encoder, which captures more detailed interaction features to enrich visual representation and designs a teacher-recommended learning method to make full use of the successful external language model (ELM) to integrate the abundant linguistic knowledge into the caption model.
Aspect Ranking: Identifying Important Product Aspects from Online Consumer Reviews
TLDR
This paper develops an aspect ranking algorithm, which aims to automatically identify important product aspects from online consumer reviews by simultaneously considering the aspect frequency and the influence of consumers' opinions given to each aspect on their overall opinions.
MiCT: Mixed 3D/2D Convolutional Tube for Human Action Recognition
TLDR
A Mixed Convolutional Tube (MiCT) is proposed that integrates 2D CNNs with the 3D convolution module to generate deeper and more informative feature maps, while reducing training complexity in each round of spatio-temporal fusion.
Visual-Textual Joint Relevance Learning for Tag-Based Social Image Search
TLDR
An approach that simultaneously utilizes both visual and textual information to estimate the relevance of user tagged images is proposed, and the relevance estimation is determined with a hypergraph learning approach.
Camera Lens Super-Resolution
TLDR
This paper investigates SR from the perspective of camera lenses, termed CameraSR, which aims to alleviate the intrinsic tradeoff between resolution (R) and field-of-view (V) in realistic imaging systems, and quantitatively analyzes the performance of commonly used synthetic degradation models.