Long-term recurrent convolutional networks for visual recognition and description
TLDR
We develop a novel recurrent convolutional architecture suitable for large-scale visual learning which is end-to-end trainable and demonstrate the value of these models on benchmark video recognition tasks, image description and retrieval problems, and video narration challenges.
Sequence to Sequence -- Video to Text
TLDR
We propose a novel end-to-end sequence to sequence model to generate captions for videos.
Multimodal Compact Bilinear Pooling for Visual Question Answering and Visual Grounding
TLDR
We propose to rely on Multimodal Compact Bilinear Pooling (MCB) to efficiently and expressively combine multimodal features.
Efficient Lifelong Learning with A-GEM
TLDR
In lifelong learning, the learner is presented with a sequence of tasks, incrementally building a data-driven prior which may be leveraged to speed up learning of a new task.
A database for fine grained activity detection of cooking activities
TLDR
We propose a novel dataset of 65 fine-grained cooking activities, continuously recorded in a realistic setting.
Neural Module Networks
TLDR
We describe a method for constructing and learning neural module networks, which compose collections of jointly-trained neural "modules" into deep networks for question answering.
Memory Aware Synapses: Learning what (not) to forget
TLDR
We propose a novel approach for lifelong learning, coined Memory Aware Synapses, which computes the importance of the parameters of a neural network in an unsupervised and online manner.
Grounding of Textual Phrases in Images by Reconstruction
TLDR
We propose a novel approach to grounding of textual phrases in images which can operate in all supervision modes: with no, a few, or all grounding annotations available.
Learning to Reason: End-to-End Module Networks for Visual Question Answering
TLDR
We propose End-to-End Module Networks (N2NMNs), which learn to reason by directly predicting instance-specific network layouts without the aid of a parser.
Ask Your Neurons: A Neural-Based Approach to Answering Questions about Images
TLDR
We propose Neural-Image-QA, an approach to question answering with a recurrent neural network.