Learn More
This paper proposes a method for capturing the performance of a human or an animal from a multi-view video sequence. Given an articulated template model and silhouettes from a multi-view image sequence, our approach recovers not only the movement of the skeleton, but also the possibly non-rigid temporal deformation of the 3D surface. While large scale(More)
Generation and animation of realistic humans is an essential part of many projects in today's media industry. Especially, the games and special effects industry heavily depend on realistic human animation. In this work a unified model that describes both, human pose and body shape is introduced which allows us to accurately model muscle deformations not(More)
Multiple people tracking consists in detecting the subjects at each frame and matching these detections to obtain full trajectories. In semi-crowded environments, pedestrians often occlude each other, making tracking a challenging task. Most tracking methods make the assumption that each pedestrian's motion is independent, thereby ignoring the complex and(More)
In this paper a method for estimating a rigid skeleton, including skinning weights, skeleton connectivity, and joint positions, given a sparse set of example poses is presented. In contrast to other methods, we are able to simultaneously take examples of different subjects into account, which improves the robustness of the estimation. It is additionally(More)
In this article we present the integration of 3-D shape knowledge into a variational model for level set based image segmentation and tracking. Given a 3-D surface model of an object that is visible in the image of one or multiple cameras calibrated to the same world coordinate system, the object contour extracted by the segmentation method is applied to(More)
In this work we present an approach for markerless motion capture (MoCap) of articulated objects, which are recorded with multiple unsynchronized moving cameras. Instead of using fixed (and expensive) hardware synchronized cameras, this approach allows us to track people with off-the-shelf handheld video cameras. To prepare a sequence for motion capture, we(More)
In this paper, we propose the combined use of complementary concepts for 3D tracking: region fitting on one side and dense optical flow as well as tracked SIFT features on the other. Both concepts are chosen such that they can compensate for the shortcomings of each other. While tracking by the object region can prevent the accumulation of errors, optical(More)
We present a new algorithm to jointly track multiple objects in multi-view images. While this has been typically addressed separately in the past, we tackle the problem as a single global optimization. We formulate this assignment problem as a min-cost problem by defining a graph structure that captures both temporal correlations between objects as well as(More)
In this paper, we propose a method for learning a class representation that can return a continuous value for the pose of an unknown class instance using only 2D data and weak 3D labeling information. Our method is based on generative feature models, i.e., regression functions learned from local descriptors of the same patch collected under different(More)
Gesture recognition remains a very challenging task in the field of computer vision and human computer interaction (HCI). A decade ago the task seemed to be almost unsolvable with the data provided by a single RGB camera. Due to recent advances in sensing technologies, such as time-of-flight and structured light cameras, there are new data sources(More)