• Publications
  • Influence
Bottom-Up and Top-Down Attention for Image Captioning and VQA
tl;dr
We propose a combined bottom-up and top-down visual attention mechanism that enables attention to be calculated at the level of objects and other salient image regions, while the top-down mechanism determines feature weightings.
  • 231
  • 74
Stacked Cross Attention for Image-Text Matching
tl;dr
In this paper, we present Stacked Cross Attention to discover the full latent alignments using both image regions and words in a sentence as context and infer the image-text similarity.
  • 184
  • 67
  • Open Access
Deep Learning with Low Precision by Half-Wave Gaussian Quantization
tl;dr
The problem of quantizing the activations of a deep neural network is considered.
  • 226
  • 52
  • Open Access
A Corpus and Cloze Evaluation for Deeper Understanding of Commonsense Stories
tl;dr
Representation and learning of commonsense knowledge is one of the foundational problems in the quest to enable deep language understanding.
  • 172
  • 37
  • Open Access
Tips and Tricks for Visual Question Answering: Learnings from the 2017 Challenge
tl;dr
This paper presents a relatively simple model for VQA that achieves state-of-the-art results.
  • 184
  • 33
  • Open Access
A Multi-View Deep Learning Approach for Cross Domain User Modeling in Recommendation Systems
tl;dr
We use a deep learning approach to map users and items to a latent space where the similarity between users and their preferred items is maximized.
  • 344
  • 31
  • Open Access
CleanNet: Transfer Learning for Scalable Image Classifier Training with Label Noise
tl;dr
We introduce CleanNet, a joint neural embedding network that requires only a fraction of the classes to be manually verified in order to provide knowledge of label noise that can be transferred to other classes.
  • 98
  • 19
  • Open Access
A Corpus and Evaluation Framework for Deeper Understanding of Commonsense Stories
tl;dr
This paper introduces a new corpus of ~50k five-sentence commonsense stories, ROCStories, to enable the evaluation of story understanding.
  • 78
  • 15
  • Open Access
End-to-end Structure-Aware Convolutional Networks for Knowledge Base Completion
tl;dr
We propose a novel end-to-end Structure-Aware Convolutional Network (SACN) that combines the benefits of GCN and ConvE.
  • 40
  • 12
  • Open Access
Multi-Rate Deep Learning for Temporal Recommendation
tl;dr
We propose a novel deep-neural-network-based architecture that models the combination of long-term static and short-term temporal user preferences to improve recommendation performance.
  • 95
  • 9
  • Open Access