iCoseg: Interactive co-segmentation with intelligent scribble guidance
iCoseg is proposed: an automatic recommendation system that intelligently suggests where the user should scribble next; users following these recommendations can achieve good-quality cutouts with significantly less time and effort than exhaustively examining all cutouts.
StereoNet: Guided Hierarchical Refinement for Real-Time Edge-Aware Depth Prediction
This paper presents StereoNet, the first end-to-end deep architecture for real-time stereo matching that runs at 60 fps on an NVidia Titan X, producing high-quality, edge-preserved, quantization-free disparity maps.
Fusion4D: real-time performance capture of challenging scenes
This work contributes a new pipeline for live multi-view performance capture, generating temporally coherent high-quality reconstructions in real-time, highly robust to both large frame-to-frame motion and topology changes, allowing us to reconstruct extremely challenging scenes.
Holoportation: Virtual 3D Teleportation in Real-time
This paper demonstrates high-quality, real-time 3D reconstructions of an entire space, including people, furniture and objects, using a set of new depth cameras, and allows users wearing virtual or augmented reality displays to see, hear and interact with remote participants in 3D, almost as if they were present in the same physical space.
Interactively Co-segmenting Topically Related Images with Intelligent Scribble Guidance
An algorithm that lets users decide what the foreground is and guide the output of the co-segmentation algorithm toward it via scribbles; keeping a user in the loop leads to simpler, highly parallelizable energy functions, allowing significantly more images per group.
Multiple View Object Cosegmentation Using Appearance and Stereo Cues
This work presents an automatic approach to segment an object in calibrated images acquired from multiple viewpoints, formulated using an energy minimization framework that combines stereo and appearance cues, where for each surface, an appearance model is learnt using an unsupervised approach.
ActiveStereoNet: End-to-End Self-Supervised Learning for Active Stereo Systems
This paper presents ActiveStereoNet, the first deep learning solution for active stereo systems that is fully self-supervised, yet produces depth with subpixel precision; it does not suffer from common over-smoothing issues, preserves edges, and explicitly handles occlusions.
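The core idea behind self-supervised stereo training can be illustrated with a generic photometric reconstruction loss: a predicted disparity is judged by how well it warps one view onto the other, so no ground-truth depth is needed. The 1-D toy below is a hedged sketch of that principle only, not ActiveStereoNet's actual loss, which additionally builds in illumination invariance and explicit occlusion handling.

```python
# Toy sketch of a photometric reconstruction loss for self-supervised
# stereo (illustrative only; not the paper's loss).
import numpy as np

def warp_right_to_left(right_row, disparity_row):
    """Sample the right image row at x - d(x) with linear interpolation."""
    x = np.arange(right_row.size) - disparity_row
    x = np.clip(x, 0, right_row.size - 1)
    lo = np.floor(x).astype(int)
    hi = np.minimum(lo + 1, right_row.size - 1)
    frac = x - lo
    return (1 - frac) * right_row[lo] + frac * right_row[hi]

# Hypothetical scene: the left row is the right row shifted by a
# true disparity of 3 pixels.
right = np.sin(np.linspace(0, 6 * np.pi, 64))
true_disp = np.full(64, 3.0)
left = warp_right_to_left(right, true_disp)

def photometric_loss(disp):
    """Mean absolute error between the left row and the warped right row."""
    return np.mean(np.abs(left - warp_right_to_left(right, disp)))

# The loss is minimized at the true disparity, without any depth labels.
print(photometric_loss(true_disp), photometric_loss(np.full(64, 7.0)))
```

In a real system the disparity comes from a network and this loss is backpropagated through the differentiable warp; the toy simply shows why the true disparity is the minimizer.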
Toward Holistic Scene Understanding: Feedback Enabled Cascaded Classification Models
This work proposes Feedback Enabled Cascaded Classification Models (FE-CCM), a two-layer cascade of classifiers that jointly optimizes all the subtasks while requiring only a “black box” interface to the original classifier for each subtask.
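The cascade structure described above can be sketched in a few lines: a first layer of independent black-box classifiers, then a second layer where each subtask also sees every subtask's first-layer beliefs, letting context flow between subtasks. All names and data below are hypothetical; this is a minimal two-subtask sketch of the cascade idea, not the FE-CCM training procedure itself.

```python
# Minimal sketch of a two-layer cascaded classification model with two
# hypothetical subtasks "a" and "b" sharing the same raw features.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic data: two related subtasks over the same inputs.
X = rng.normal(size=(200, 5))
y_task_a = (X[:, 0] + 0.1 * rng.normal(size=200) > 0).astype(int)
y_task_b = (X[:, 1] - X[:, 0] > 0).astype(int)

# Layer 1: one black-box classifier per subtask, trained on raw features.
layer1 = {t: LogisticRegression().fit(X, y)
          for t, y in [("a", y_task_a), ("b", y_task_b)]}
probs1 = np.column_stack([layer1[t].predict_proba(X)[:, 1]
                          for t in ("a", "b")])

# Layer 2: each subtask sees the raw features plus *all* subtasks'
# layer-1 beliefs, so errors in one subtask can be corrected by context
# from the others.
X2 = np.hstack([X, probs1])
layer2 = {"a": LogisticRegression().fit(X2, y_task_a),
          "b": LogisticRegression().fit(X2, y_task_b)}

acc_a = layer2["a"].score(X2, y_task_a)
print(f"layer-2 accuracy on subtask a: {acc_a:.2f}")
```

The "black box" property in the summary corresponds to the fact that layer 2 only consumes the first layer's output probabilities, never its internals.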
Motion2fusion: real-time volumetric performance capture
This work provides major contributions over prior work: a new non-rigid fusion pipeline that reconstructs high-frequency geometric detail far more faithfully, avoiding the over-smoothing and visual artifacts observed previously, and a high-speed pipeline coupled with a machine learning technique for 3D correspondence field estimation, reducing the tracking errors and artifacts attributed to fast motion.
Articulated distance fields for ultra-fast tracking of hands interacting
An articulated signed distance function is constructed that, for any pose, yields a closed-form calculation of both the distance to the detailed surface geometry and the derivatives needed to perform gradient-based optimization.
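The idea of a pose-parameterized distance field with closed-form derivatives can be shown on a deliberately tiny example. The sketch below (not the paper's hand model; all geometry is hypothetical) poses a single "bone" sphere by a rotation angle and fits that angle by gradient descent, using an analytic derivative of the distance with respect to the pose.

```python
# Toy articulated signed distance function: one sphere whose center is
# rotated about the origin by a pose angle theta. Distance and its
# derivative w.r.t. theta are both closed form.
import numpy as np

RADIUS = 0.5                     # hypothetical sphere radius
CENTER = np.array([1.0, 0.0])    # rest-pose bone-sphere center

def rot(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def sdf(p, theta):
    """Signed distance from point p to the posed sphere."""
    return np.linalg.norm(p - rot(theta) @ CENTER) - RADIUS

def dsdf_dtheta(p, theta):
    """Analytic derivative of the distance w.r.t. the pose angle."""
    c = rot(theta) @ CENTER
    d = p - c
    dc = rot(theta + np.pi / 2) @ CENTER   # d(rotated center)/d(theta)
    return -(d @ dc) / np.linalg.norm(d)

# Fit the pose so the sphere surface passes through an observed point,
# by gradient descent on the squared distance.
target = np.array([0.0, 1.2])
theta = 0.1
for _ in range(100):
    theta -= 0.2 * 2 * sdf(target, theta) * dsdf_dtheta(target, theta)

print(f"residual distance: {abs(sdf(target, theta)):.4f}")
```

The point of the closed form is exactly this: no numerical differencing or mesh queries are needed inside the optimization loop, which is what makes the tracking ultra-fast.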