We describe an approach for unsupervised learning of a generic, distributed sentence encoder. Using the continuity of text from books, we train an encoder-decoder model that tries to reconstruct the surrounding sentences of an encoded passage. Sentences that share semantic and syntactic properties are thus mapped to similar vector representations. We next ...
The goal of this paper is to generate high-quality 3D object proposals in the context of autonomous driving. Our method exploits stereo imagery to place proposals in the form of 3D bounding boxes. We formulate the problem as minimizing an energy function encoding object size priors, ground plane as well as several depth informed features that reason about ...
We introduce the MovieQA dataset, which aims to evaluate automatic story comprehension from both video and text. The dataset consists of 14,944 questions about 408 movies with high semantic diversity. The questions range from simpler "Who" did "What" to "Whom", to "Why" and "How" certain events occurred. Each question comes with a set of five possible ...
Books are a rich source of both fine-grained information (what a character, an object or a scene looks like) and high-level semantics (what someone is thinking and feeling, and how these states evolve through a story). This paper aims to align books to their movie releases in order to provide rich descriptive explanations for visual content that go ...
In this paper, we propose an approach that exploits object segmentation in order to improve the accuracy of object detection. We frame the problem as inference in a Markov Random Field, in which each detection hypothesis scores object appearance as well as contextual information using Convolutional Neural Networks, and allows the hypothesis to choose and ...
The goal of this paper is to perform 3D object detection in the context of autonomous driving. Our method aims at generating a set of high-quality 3D object proposals by exploiting stereo imagery. We formulate the problem as minimizing an energy function that encodes object size priors, placement of objects on the ground plane as well as several depth ...
This paper proposes a deep learning architecture based on Residual Networks that dynamically adjusts the number of executed layers for the regions of the image. This architecture is end-to-end trainable, deterministic and problem-agnostic. It is therefore applicable without any modifications to a wide range of computer vision problems such as image ...
The challenge in photothermal therapy (PTT) is to develop biocompatible photothermal transducers that can absorb and convert near-infrared (NIR) light into heat with high efficiency. Herein, we report salt-induced aggregation of gold nanoparticles (GNPs) in biological media to form highly efficient and biocompatible NIR photothermal transducers for PTT and ...
Photoacoustic imaging and fluorescence molecular imaging are emerging as important research tools for biomedical studies. Photoacoustic imaging offers both strong optical absorption contrast and high ultrasonic resolution, and fluorescence molecular imaging provides excellent superficial resolution, high sensitivity, high throughput, and the ability for ...