Learn More
We propose a method for understanding the 3D geometry of indoor environments (e.g. bedrooms, kitchens) while simultaneously identifying objects in the scene (e.g. beds, couches, doors). We focus on how modeling the geometry and location of specific objects is helpful for indoor scene understanding. For example, beds are shorter than they are wide, and are(More)
Figure 1. The use of more specific and detailed geometric models as proposed in this paper enables better understanding of scenes, illustrated here by localizing chairs tucked under the table in 3D. Abstract We develop a comprehensive Bayesian generative model for understanding indoor scenes. While it is common in this domain to approximate objects with 3D(More)
We develop a Bayesian modeling approach for tracking people in 3D from monocular video with unknown cameras. Modeling in 3D provides natural explanations for occlusions and smoothness discontinuities that result from projection, and allows priors on velocity and smoothness to be grounded in physical quantities: meters and seconds vs. pixels and frames. We(More)
We propose an unsupervised approach for discovering characteristic motion patterns in videos of highly articulated objects performing natural, unscripted behaviors, such as tigers in the wild. We discover consistent patterns in a bottom-up manner by analyzing the relative displacements of large numbers of ordered trajectory pairs through time, such that(More)
Given unstructured videos of deformable objects (such as animals in the wild), we automatically recover spa-tiotemporal correspondences to map one object to another. In contrast to traditional methods based on appearance, which fail in such challenging conditions, we exploit consistency in observed object motion between instances. Our approach discovers(More)
We propose an automatic system for organizing the content of a collection of unstructured videos of an articulated object class (e.g., tiger, horse). By exploiting the recurring motion patterns of the class across videos, our system: (1) identifies its characteristic behaviors, and (2) recovers pixel-to-pixel alignments across different instances. Our(More)
We present a method for automatically aligning words to image regions that integrates specific object classifiers (e.g., "car" detectors) with weak models based on appearance features. Previous strategies have largely focused on the latter, and thus have not exploited progress on object category recognition. Hence, we augment region labeling with object(More)
We propose a motion-based method to discover the physical parts of an articulated object class (e.g. head/torso/leg of a horse) from multiple videos. The key is to find object regions that exhibit consistent motion relative to the rest of the object, across multiple videos. We can then learn a location model for the parts and segment them accurately in the(More)
  • 1