HumanEva: Synchronized Video and Motion Capture Dataset and Baseline Algorithm for Evaluation of Articulated Human Motion
TLDR
A baseline algorithm for 3D articulated tracking is described that uses a relatively standard Bayesian framework with optimization in the form of Sequential Importance Resampling and Annealed Particle Filtering; a variety of likelihood functions, prior models of human motion, and the effects of algorithm parameters are explored.
HumanEva: Synchronized Video and Motion Capture Dataset for Evaluation of Articulated Human Motion
TLDR
There is a need for common datasets that allow fair comparison between different methods and their design choices to establish the current state of the art, and it is argued that HumanEva-I will become a standard dataset for the evaluation of articulated human motion and pose estimation.
A Quantitative Evaluation of Video-based 3D Person Tracking
TLDR
Bayesian estimation of 3D human motion from video sequences is quantitatively evaluated using synchronized, multi-camera, calibrated video and 3D ground-truth poses acquired with a commercial motion capture system; the results suggest that, in constrained laboratory environments, current methods perform quite well.
Tracking loose-limbed people
TLDR
The problem of 3D human tracking is posed as one of inference in a graphical model in which the body is a collection of loosely connected limbs; inference is performed with non-parametric belief propagation, using a variation of particle filtering that can be applied over a general loopy graph.
Learning Activity Progression in LSTMs for Activity Detection and Early Detection
TLDR
This work designs novel ranking losses that directly penalize the model for violating monotonicity in its detection scores as an activity progresses; these losses are used together with a classification loss when training LSTM models.
Multilevel Language and Vision Integration for Text-to-Clip Retrieval
TLDR
A multilevel model that integrates vision and language features earlier and more tightly than prior work is introduced, and text features are injected early on when generating clip proposals to help eliminate unlikely clips and thus speed up processing and boost performance.
Measure Locally, Reason Globally: Occlusion-sensitive Articulated Pose Estimation
TLDR
An extension of an approximate belief propagation algorithm (PAMPAS) is presented that recovers the real-valued 2D pose of the body in the presence of occlusions, does not require strong priors over body pose, and does a quantitatively better job of explaining image evidence than previous methods.
Implicit Probabilistic Models of Human Motion for Synthesis and Tracking
TLDR
A low-dimensional linear model of human motion is learned and used to structure the example motion database into a binary tree; an approximate probabilistic tree search method exploits the coefficients of this low-dimensional representation and runs in sub-linear time.
Image Generation From Layout
TLDR
The proposed Layout2Im model significantly outperforms the previous state of the art, boosting the best reported inception score by 24.66% and 28.57% on the very challenging COCO-Stuff and Visual Genome datasets, respectively.
Detailed Human Shape and Pose from Images
TLDR
This work represents the body using a recently proposed triangulated mesh model called SCAPE, which employs a low-dimensional but detailed parametric model of shape and pose-dependent deformations learned from a database of range scans of human bodies.