• Publications
XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks
TLDR
The Binary-Weight-Network version of AlexNet is compared with two recent network binarization methods, BinaryConnect and BinaryNets, and outperforms both by large margins on ImageNet: more than \(16\,\%\) in top-1 accuracy.
ESPNet: Efficient Spatial Pyramid of Dilated Convolutions for Semantic Segmentation
TLDR
ESPNet is a fast and efficient convolutional neural network for semantic segmentation of high-resolution images under resource constraints. It outperforms current efficient CNNs such as MobileNet, ShuffleNet, and ENet on both standard metrics and newly introduced metrics that measure efficiency on edge devices.
Attribute Discovery via Predictable Discriminative Binary Codes
TLDR
In this work, each image claims its own code in a way that maintains discrimination while remaining predictable from visual data, and the method outperforms state-of-the-art binary code methods on this large-scale dataset.
ESPNetv2: A Light-Weight, Power Efficient, and General Purpose Convolutional Neural Network
We introduce a light-weight, power efficient, and general purpose convolutional neural network, ESPNetv2, for modeling visual and sequential data. Our network uses group point-wise and depth-wise
OK-VQA: A Visual Question Answering Benchmark Requiring External Knowledge
TLDR
This paper addresses the task of knowledge-based visual question answering and provides a benchmark, called OK-VQA, where the image content is not sufficient to answer the questions, encouraging methods that rely on external knowledge resources.
What’s Hidden in a Randomly Weighted Neural Network?
TLDR
It is empirically shown that as randomly weighted neural networks with fixed weights grow wider and deeper, an "untrained subnetwork" approaches a network with learned weights in accuracy.
Predictable Dual-View Hashing
We propose a Predictable Dual-View Hashing (PDH) algorithm which embeds proximity of data samples in the original spaces. We create a cross-view hamming space with the ability to compare information
Learning to Learn How to Learn: Self-Adaptive Visual Navigation Using Meta-Learning
TLDR
A self-adaptive visual navigation method (SAVN) is proposed that learns to adapt to new environments without any explicit supervision and shows major improvements in both success rate and SPL for visual navigation in novel scenes.
IQA: Visual Question Answering in Interactive Environments
TLDR
The Hierarchical Interactive Memory Network (HIMN) is proposed. It consists of a factorized set of controllers that allow the system to operate at multiple levels of temporal abstraction, and it outperforms popular single-controller methods on IQUAD V1.
Label Refinery: Improving ImageNet Classification through Label Progression
TLDR
The effects of various properties of labels are studied, an iterative procedure that updates the ground-truth labels after examining the entire dataset is introduced, and significant gains are shown using refined labels across a wide range of models.