Seeing Invisible Poses: Estimating 3D Body Pose from Egocentric Video

@article{Jiang2017SeeingIP,
  title={Seeing Invisible Poses: Estimating 3D Body Pose from Egocentric Video},
  author={Hao Jiang and Kristen Grauman},
  journal={2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2017}
}
  • Hao Jiang, K. Grauman
  • Published 24 March 2016
  • Computer Science
Understanding the camera wearer's activity is central to egocentric vision, yet one key facet of that activity is inherently invisible to the camera: the wearer's body pose. Prior work focuses on estimating the pose of hands and arms when they come into view, but this 1) gives an incomplete view of the full body posture, and 2) prevents any pose estimate at all in many frames, since the hands are only visible in a fraction of daily life activities. We propose to infer the invisible pose of… 

Figures from this paper

Egocentric Pose Estimation from Human Vision Span

  • Hao Jiang, V. Ithapu
  • Computer Science
    2021 IEEE/CVF International Conference on Computer Vision (ICCV)
  • 2021
A novel deep learning system takes advantage of both the dynamic features from camera SLAM and the body shape imagery to estimate ego-pose from a more natural human vision span, where the camera wearer can be seen in the peripheral view and, depending on the head pose, may become invisible or only partially visible.

Ego-Body Pose Estimation via Ego-Head Pose Estimation

A new method, Ego-Body Pose Estimation via Egocentric video and human motions, decomposes the problem into two stages, connected by the head motion as an intermediate representation, and integrates SLAM and a learning approach to estimate accurate head motion.

You2Me: Inferring Body Pose in Egocentric Video via First and Second Person Interactions

It is shown that since interactions between individuals often induce a well-ordered series of back-and-forth responses, it is possible to learn a temporal model of the interlinked poses even though one party is largely out of view.

You2Me: Inferring Body Pose in Egocentric Video via First and Second Person Interactions by Evonne

This work proposes a learning-based approach to estimate the camera wearer’s 3D body pose from egocentric video sequences by leveraging interactions with another person as a signal inherently linked to the body pose of the first-person subject.

xR-EgoPose: Egocentric 3D Human Pose From an HMD Camera

A new solution to egocentric 3D body pose estimation from monocular images captured from a downward-looking fish-eye camera installed on the rim of a head-mounted virtual reality device, using a new encoder-decoder architecture with a novel dual-branch decoder designed specifically to account for the varying uncertainty in the 2D joint locations.

Estimating Egocentric 3D Human Pose in Global Space

To achieve accurate and temporally stable global poses, a spatio-temporal optimization is performed over a sequence of frames by minimizing heatmap reprojection errors and enforcing local and global body motion priors learned from a mocap dataset.

EgoGlass: Egocentric-View Human Pose Estimation From an Eyeglass Frame

A new egocentric motion-capture device that adds next to no extra burden on the user, together with a dataset of real people performing a diverse set of actions captured by EgoGlass; the approach achieves state-of-the-art results on xR-EgoPose and is on par with the existing method on EgoCap without requiring temporal information or personalization for each individual user.

SelfPose: 3D Egocentric Pose Estimation from a Headset Mounted Camera

We present a solution to egocentric 3D body pose estimation from monocular images captured from downward-looking fish-eye cameras installed on the rim of a head-mounted VR device. This unusual…

3D Ego-Pose Estimation via Imitation Learning

This work proposes a novel control-based approach to model human motion with physics simulation and uses imitation learning to learn a video-conditioned control policy for ego-pose estimation, allowing for domain adaptation to transfer the policy trained on simulation data to real-world data.

First-person pose recognition using egocentric workspaces

An efficient pipeline is proposed which generates synthetic workspace exemplars for training using a virtual chest-mounted camera whose intrinsic parameters match the authors' physical camera, computes perspective-aware depth features on this entire volume and recognizes discrete arm+hand pose classes through a sparse multi-class SVM.
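The final step of the pipeline summarized above is a sparse multi-class SVM over perspective-aware depth features. As a rough, hedged illustration of only that classification step (the feature extraction is not reproduced; the class count, feature dimension, and synthetic "exemplars" below are all assumptions for the sketch), using scikit-learn's L1-penalized linear SVM as a stand-in:

```python
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)

N_CLASSES = 4    # discrete arm+hand pose classes (assumed count)
FEAT_DIM = 128   # stand-in for perspective-aware depth features
N_TRAIN = 200

# Synthetic "workspace exemplars": one well-separated cluster per pose class.
centers = rng.normal(size=(N_CLASSES, FEAT_DIM))
y_train = rng.integers(0, N_CLASSES, size=N_TRAIN)
X_train = centers[y_train] + 0.1 * rng.normal(size=(N_TRAIN, FEAT_DIM))

# An L1 penalty yields a sparse linear multi-class SVM, loosely matching
# the "sparse multi-class SVM" described in the summary above.
clf = LinearSVC(penalty="l1", dual=False, C=1.0)
clf.fit(X_train, y_train)

# Classify a new frame's feature vector into a discrete pose class.
x_test = centers[2] + 0.1 * rng.normal(size=FEAT_DIM)
print(clf.predict(x_test[None, :])[0])  # prints 2
```

The sparsity from the L1 penalty means only a subset of feature dimensions carries weight, which is one plausible reading of why a sparse SVM suits high-dimensional volumetric depth features.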

Action Recognition in the Presence of One Egocentric and Multiple Static Cameras

A model is introduced that can benefit from the best of both worlds, learning to predict the importance of each camera for recognizing actions in each frame through joint discriminative learning of latent camera-importance variables and action classifiers.

Delving into egocentric actions

A novel set of egocentric features is presented, together with how they can be combined with motion and object features, uncovering a significant performance boost over all previous state-of-the-art methods.

EgoCap: egocentric marker-less motion capture with two fisheye cameras

This work proposes a new method for real-time, marker-less, and egocentric motion capture: estimating the full-body skeleton pose from a lightweight stereo pair of fisheye cameras attached to a helmet or virtual reality headset - an optical inside-in method, so to speak.

A survey of human pose estimation: The body parts parsing based methods

Figure-ground segmentation improves handled object recognition in egocentric video

  • Xiaofeng Ren, Chunhui Gu
  • Computer Science
    2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition
  • 2010
This work develops a bottom-up motion-based approach to robustly segment out foreground objects in egocentric video and shows that it greatly improves object recognition accuracy.

Full scaled 3D visual odometry from a single wearable omnidirectional camera

A method is presented to recover the scale of the scene using an omnidirectional camera mounted on a helmet, with an empirical formula based on biomedical experiments on human walking to cope with scale drift.

3D human pose from silhouettes by relevance vector regression

  • A. Agarwal, B. Triggs
  • Computer Science
    Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004. CVPR 2004.
  • 2004
This work describes a learning-based method for recovering 3D human body pose by direct nonlinear regression against shape-descriptor vectors extracted automatically from image silhouettes; results are a factor of 3 better than the current state of the art for the much simpler upper-body problem.
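The core idea summarized above is direct nonlinear regression from a silhouette shape descriptor to a pose vector. A minimal sketch of that general idea, substituting scikit-learn's kernel ridge regression for the paper's relevance vector machine (the descriptor dimension, pose dimension, and synthetic training pairs below are assumptions, not the paper's data):

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge

rng = np.random.default_rng(1)

DESC_DIM = 100  # stand-in for a silhouette shape descriptor (e.g. a histogram)
POSE_DIM = 54   # e.g. 18 joints x 3 angles (assumed)

# Synthetic training pairs: pose as a smooth nonlinear function of descriptor.
W = rng.normal(size=(DESC_DIM, POSE_DIM))
X = rng.normal(size=(500, DESC_DIM))
Y = np.tanh(0.1 * (X @ W))

# RBF kernel ridge regression in place of relevance vector regression;
# both map descriptors to pose via a kernelized nonlinear regressor.
reg = KernelRidge(kernel="rbf", alpha=1e-2, gamma=1e-2)
reg.fit(X, Y)

pred = reg.predict(X[:5])
print(pred.shape)  # prints (5, 54)
```

The relevance vector machine would additionally produce a sparse set of "relevant" training examples; kernel ridge is used here only because it shares the same input-to-output regression structure in a few lines.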

Pixel-Level Hand Detection in Ego-centric Videos

This work presents a fully labeled indoor/outdoor ego-centric hand detection benchmark dataset containing over 200 million labeled pixels, which contains hand images taken under various illumination conditions and highlights the effectiveness of sparse features and the importance of modeling global illumination.

Understanding egocentric activities

This work presents a method to analyze daily activities using video from an egocentric camera, and shows that joint modeling of activities, actions, and objects leads to superior performance in comparison to the case where they are considered independently.