Georgios Pavlakos

This paper addresses the challenge of 3D human pose estimation from a single color image. Despite the general success of the end-to-end learning paradigm, top performing approaches employ a two-step solution consisting of a Convolutional Network (ConvNet) for 2D joint localization and a subsequent optimization step to recover 3D pose. In this paper, we …
Recent advances with Convolutional Networks (ConvNets) have shifted the bottleneck for many computer vision tasks to annotated data collection. In this paper, we present a geometry-driven approach to automatically collect annotations for human pose prediction tasks. Starting from a generic ConvNet for 2D human pose, and assuming a multi-view setup, we …
Recovering 3D full-body human pose is a challenging problem with many applications. It has been successfully addressed by motion capture systems with body worn markers and multiple cameras. In this paper, we address the more challenging case of not only using a single camera but also not leveraging markers: going directly from 2D appearance to 3D geometry. …
Mobility disabilities are prevalent in our ageing society and impede activities important for the independent living of elderly people and their quality of life. The goal of this work is to support human mobility and thus enforce fitness and vitality by developing intelligent robotic platforms designed to provide user-centred and natural support for …
We present a new framework for multimodal gesture recognition that is based on a two-pass fusion scheme. In this, we deal with a demanding Kinect-based multimodal dataset, which was introduced in a recent gesture recognition challenge. We employ multiple modalities, i.e., visual cues, such as colour and depth images, as well as audio, and we specifically …
This paper presents a novel approach to estimating the continuous six degree of freedom (6-DoF) pose (3D translation and rotation) of an object from a single RGB image. The approach combines semantic keypoints predicted by a convolutional network (convnet) with a deformable shape model. Unlike prior work, we are agnostic to whether the object is textured or …
In this supplementary, we provide material that could not be included in the main manuscript due to space constraints. First, Section 1 provides additional details for the exact representation of the volumetric space and the way we can obtain metric pose estimates from the voxelized estimates. Section 2 presents full results on Human3.6M using the …
In this supplementary, we provide material that could not be included in the main manuscript due to space constraints. Section 1 provides additional quantitative evaluation of our approach for multi-view pose estimation, and comparison with the state-of-the-art for HumanEva-I [4]. Section 2 provides full results of the multi-view optimization on Human3.6M …
We aim at developing an intelligent robotic platform that provides cognitive assistance and natural support in indoor environments to the elderly society and to individuals with moderate to mild walking impairment. Towards this end, we process data from audiovisual sensors and laser range scanners, acquired in experiments with patients in real life …