Publications
Understanding deep image representations by inverting them
Image representations, from SIFT and Bag of Visual Words to Convolutional Neural Networks (CNNs), are a crucial component of almost any image understanding system. Nevertheless, our understanding of them remains limited.
Object-Centric Learning with Slot Attention
TLDR
An architectural component is presented that interfaces with perceptual representations, such as the output of a convolutional neural network, and produces a set of task-dependent abstract representations ("slots"). The slots are exchangeable and can bind to any object in the input by specializing through a competitive attention procedure over multiple rounds; a sketch of this step follows.
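A minimal PyTorch sketch of the competitive attention step described above; it is illustrative rather than the authors' reference implementation (slot initialization is simplified to a learned parameter, and the paper's residual MLP update is omitted):

```python
import torch
import torch.nn as nn

class SlotAttention(nn.Module):
    """Simplified Slot Attention: slots compete for input features via a
    softmax over slots, then are updated with a GRU over several rounds."""
    def __init__(self, num_slots, dim, iters=3):
        super().__init__()
        self.num_slots, self.iters, self.scale = num_slots, iters, dim ** -0.5
        self.slots_init = nn.Parameter(torch.randn(1, num_slots, dim))
        self.to_q, self.to_k, self.to_v = (nn.Linear(dim, dim) for _ in range(3))
        self.gru = nn.GRUCell(dim, dim)
        self.norm_in, self.norm_slots = nn.LayerNorm(dim), nn.LayerNorm(dim)

    def forward(self, inputs):                       # inputs: (B, N, dim)
        B = inputs.shape[0]
        inputs = self.norm_in(inputs)
        k, v = self.to_k(inputs), self.to_v(inputs)
        slots = self.slots_init.expand(B, -1, -1)
        for _ in range(self.iters):
            q = self.to_q(self.norm_slots(slots))
            attn = (q @ k.transpose(1, 2) * self.scale).softmax(dim=1)  # over slots: competition
            attn = attn / attn.sum(dim=-1, keepdim=True)                # renormalize over inputs
            updates = attn @ v                                          # (B, num_slots, dim)
            slots = self.gru(updates.reshape(-1, updates.shape[-1]),
                             slots.reshape(-1, slots.shape[-1])).reshape(B, self.num_slots, -1)
        return slots                                 # one vector per (potential) object
```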
Visualizing Deep Convolutional Neural Networks Using Natural Pre-images
TLDR
This paper studies several landmark representations, both shallow and deep, using a number of complementary visualization techniques based on the concept of the "natural pre-image". It shows that several layers in CNNs retain photographically accurate information about the image, with different degrees of geometric and photometric invariance.
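A minimal sketch of pre-image reconstruction by gradient descent, assuming an arbitrary differentiable PyTorch feature extractor `phi`; the total-variation term stands in for the paper's natural-image regularizers, and all hyperparameters are illustrative:

```python
import torch

def total_variation(x):
    """TV regularizer: steers the optimization towards piecewise-smooth,
    'natural' pre-images rather than adversarial noise."""
    return ((x[..., 1:, :] - x[..., :-1, :]).abs().mean() +
            (x[..., :, 1:] - x[..., :, :-1]).abs().mean())

def invert(phi, target_feats, shape=(1, 3, 224, 224), steps=400, lam=1e-2):
    """Find an image x whose representation phi(x) matches target_feats."""
    x = torch.randn(shape, requires_grad=True)
    opt = torch.optim.Adam([x], lr=0.05)
    for _ in range(steps):
        opt.zero_grad()
        loss = (phi(x) - target_feats).pow(2).mean() + lam * total_variation(x)
        loss.backward()
        opt.step()
    return x.detach()
```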
Salient Deconvolutional Networks
TLDR
A family of reversed networks is introduced that generalizes and relates deconvolution, backpropagation and network saliency; it is used to thoroughly investigate and compare these methods in terms of the quality and meaning of the images they produce, and of which architectural choices are important in determining these properties.
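For concreteness, a sketch of plain backpropagation saliency, the simplest member of the reversed-network family compared in the paper; `model` is an assumed PyTorch classifier, and DeConvNet-style variants differ chiefly in how the ReLU is reversed on the backward pass:

```python
import torch

def backprop_saliency(model, image, class_idx):
    """Gradient of the class score w.r.t. the input pixels; its magnitude
    gives a saliency map."""
    image = image.clone().requires_grad_(True)       # image: (1, 3, H, W)
    score = model(image)[0, class_idx]               # scalar class score
    score.backward()
    return image.grad.abs().max(dim=1).values.squeeze(0)  # (H, W) map
```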
Differentiable Patch Selection for Image Recognition
TLDR
A method based on a differentiable Top-K operator selects the most relevant parts of the input so that high-resolution images can be processed efficiently; results are shown for traffic sign recognition, inter-patch relationship reasoning, and fine-grained recognition without using object/part bounding box annotations during training.
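A compact sketch of the kind of perturbed Top-K estimator such a method builds on: hard Top-K is smoothed with Gaussian noise and its Jacobian is estimated by Monte Carlo. The sample count and noise scale here are illustrative, not the paper's settings:

```python
import torch

class PerturbedTopK(torch.autograd.Function):
    """Differentiable Top-K via Gaussian smoothing of the hard indicator."""

    @staticmethod
    def forward(ctx, scores, k, num_samples=500, sigma=0.05):
        # scores: (B, N) patch relevance scores
        noise = torch.randn(num_samples, *scores.shape, device=scores.device)
        topk = (scores.unsqueeze(0) + sigma * noise).topk(k, dim=-1).indices
        ind = torch.zeros(num_samples, *scores.shape, device=scores.device)
        ind.scatter_(-1, topk, 1.0)                  # hard indicators per sample
        ctx.save_for_backward(ind, noise)
        ctx.sigma = sigma
        return ind.mean(dim=0)                       # smoothed (B, N) indicator

    @staticmethod
    def backward(ctx, grad_out):
        ind, noise = ctx.saved_tensors
        # vector-Jacobian product: E[(grad_out . indicator) * noise] / sigma
        inner = (grad_out.unsqueeze(0) * ind).sum(-1, keepdim=True)
        return (inner * noise).mean(dim=0) / ctx.sigma, None, None, None
```

The returned soft indicator can weight patch features during training and be swapped for exact Top-K at inference.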
Cross Pixel Optical Flow Similarity for Self-Supervised Learning
TLDR
This work uses motion cues, in the form of optical flow, to supervise representations of static images. It achieves state-of-the-art results for self-supervision using motion cues, competitive results for self-supervision in general, and is overall state of the art in self-supervised pre-training for semantic image segmentation.
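A hedged sketch of the core idea in PyTorch: the pairwise similarity structure of per-pixel image embeddings is matched to that of the optical flow. The KL objective and random pixel subsampling are simplifications, not the paper's exact similarity kernel:

```python
import torch
import torch.nn.functional as F

def cross_pixel_flow_loss(embeds, flow, num_pixels=256):
    """embeds: (B, C, H, W) per-pixel features predicted from a static image;
    flow: (B, 2, H, W) optical flow for the same frame. Aligns the pixel-
    pairwise similarity structure of the embeddings with that of the flow."""
    B, C, H, W = embeds.shape
    e = F.normalize(embeds.flatten(2), dim=1)        # (B, C, H*W)
    f = F.normalize(flow.flatten(2), dim=1)          # (B, 2, H*W)
    idx = torch.randint(0, H * W, (num_pixels,))     # subsample pixels
    e, f = e[..., idx], f[..., idx]
    sim_e = e.transpose(1, 2) @ e                    # (B, P, P) embedding sims
    sim_f = f.transpose(1, 2) @ f                    # (B, P, P) flow sims
    # treat each row as a distribution over pixels and match them
    return F.kl_div(F.log_softmax(sim_e, dim=-1),
                    F.softmax(sim_f, dim=-1), reduction='batchmean')
```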
The Potential of Antiviral Peptides as COVID-19 Therapeutics
TLDR
Several antiviral peptides (AVPs) have been demonstrated to exert prophylactic and therapeutic effects against coronaviruses (CoVs), suggesting that further development of this class of compounds is warranted in the face of the current pandemic threat.
Self-Supervised Learning of Video-Induced Visual Invariances
TLDR
Models trained using different variants of the proposed framework on videos from the YouTube-8M (YT8M) data set obtain state-of-the-art self-supervised transfer learning results on the 19 diverse downstream tasks of the Visual Task Adaptation Benchmark (VTAB), using only 1000 labels per task.
Heterogeneous UGV-MAV exploration using integer programming
TLDR
The integer programming formulation seamlessly integrates several practical constraints that arise in exploration with such heterogeneous agents and provides an elegant solution for assigning tasks to agents.
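A minimal task-assignment ILP sketch using the PuLP library (an assumption; the paper's formulation adds exploration-specific constraints such as travel budgets and reachability for ground versus aerial agents):

```python
import pulp

def assign_tasks(agents, tasks, cost):
    """Minimal task-assignment ILP. cost[a][t] is the (e.g. travel-time)
    cost of agent a executing task t; the capacity bound is illustrative."""
    prob = pulp.LpProblem("exploration_assignment", pulp.LpMinimize)
    x = pulp.LpVariable.dicts("x", (agents, tasks), cat="Binary")
    prob += pulp.lpSum(cost[a][t] * x[a][t] for a in agents for t in tasks)
    for t in tasks:                                  # each task done exactly once
        prob += pulp.lpSum(x[a][t] for a in agents) == 1
    for a in agents:                                 # illustrative per-agent capacity
        prob += pulp.lpSum(x[a][t] for t in tasks) <= 2
    prob.solve(pulp.PULP_CBC_CMD(msg=False))
    return {(a, t) for a in agents for t in tasks if x[a][t].value() == 1}
```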
ResearchDoom and CocoDoom: Learning Computer Vision with Games
TLDR
ResearchDoom and CocoDoom can be used to train and evaluate a variety of computer vision methods such as object recognition, detection and segmentation at the level of instances and categories, tracking, ego-motion estimation, monocular depth estimation and scene segmentation.