MonoSLAM: Real-Time Single Camera SLAM
TLDR
The first successful application of the SLAM methodology from mobile robotics to the "pure vision" domain of a single uncontrolled camera is presented, achieving real-time yet drift-free performance inaccessible to structure-from-motion approaches.
MOT16: A Benchmark for Multi-Object Tracking
TLDR
A new release of the MOTChallenge benchmark is presented; it focuses on multiple people tracking and not only offers a significant increase in the number of labeled boxes but also provides multiple object classes besides pedestrians and the level of visibility for every object of interest.
RefineNet: Multi-path Refinement Networks for High-Resolution Semantic Segmentation
TLDR
RefineNet is presented, a generic multi-path refinement network that explicitly exploits all the information available along the down-sampling process to enable high-resolution prediction using long-range residual connections and introduces chained residual pooling, which captures rich background context in an efficient manner.
Past, Present, and Future of Simultaneous Localization and Mapping: Toward the Robust-Perception Age
TLDR
What is now the de facto standard formulation for SLAM is presented, covering a broad set of topics including robustness and scalability in long-term mapping, metric and semantic representations for mapping, theoretical performance guarantees, active SLAM and exploration, and other new frontiers.
Articulated body motion capture by annealed particle filtering
TLDR
The principal contribution of the paper is the development of a modified particle filter for search in high-dimensional configuration spaces that uses a continuation principle based on annealing to gradually introduce the influence of narrow peaks in the fitness function.
Vision-and-Language Navigation: Interpreting Visually-Grounded Navigation Instructions in Real Environments
TLDR
This work provides the first benchmark dataset for visually-grounded natural language navigation in real buildings, the Room-to-Room (R2R) dataset, and presents the Matterport3D Simulator, a large-scale reinforcement learning environment based on real imagery.
Articulated Body Motion Capture by Stochastic Search
TLDR
A modified particle filter is developed which is shown to be effective at searching the high-dimensional configuration spaces encountered in visual tracking of articulated body motion and to be capable of recovering full articulated body motion efficiently.
Unsupervised CNN for Single View Depth Estimation: Geometry to the Rescue
TLDR
This work proposes an unsupervised framework to learn a deep convolutional neural network for single view depth prediction, without requiring a pre-training stage or annotated ground-truth depths, and shows that this network, trained on less than half of the KITTI dataset, gives performance comparable to that of state-of-the-art supervised methods for single view depth estimation.
Stable multi-target tracking in real-time surveillance video
TLDR
This work presents a multi-target tracking system that is designed specifically for the provision of stable and accurate head location estimates and uses a more principled approach based on a Minimal Description Length (MDL) objective which accurately models the affinity between observations.
Generalized Intersection Over Union: A Metric and a Loss for Bounding Box Regression
TLDR
This paper introduces a generalized version of IoU (GIoU) as a loss into state-of-the-art object detection frameworks and shows a consistent improvement in their performance under both the standard IoU-based and the new GIoU-based performance measures on popular object detection benchmarks.