• Publications
  • Influence
Human3.6M: Large Scale Datasets and Predictive Methods for 3D Human Sensing in Natural Environments
We introduce a new dataset, Human3.6M, of 3.6 Million accurate 3D Human poses, acquired by recording the performance of 5 female and 6 male subjects, under 4 different viewpoints, for training…
  • 833
  • 103
CPMC: Automatic Object Segmentation Using Constrained Parametric Min-Cuts
We present a novel framework to generate and rank plausible hypotheses for the spatial extent of objects in images using bottom-up computational processes and mid-level selection cues. The object…
  • 579
  • 71
Constrained parametric min-cuts for automatic object segmentation
We present a novel framework for generating and ranking plausible objects hypotheses in an image using bottom-up processes and mid-level cues. The object hypotheses are represented as figure-ground…
  • 469
  • 52
Twin Gaussian Processes for Structured Prediction
We describe twin Gaussian processes (TGP), a generic structured prediction method that uses Gaussian process (GP) priors on both covariates and responses, both multivariate, and estimates outputs by…
  • 253
  • 49
Semantic Segmentation with Second-Order Pooling
Feature extraction, coding and pooling, are important components on many contemporary object recognition paradigms. In this paper we explore novel pooling techniques that encode the second-order…
  • 417
  • 47
The Moving Pose: An Efficient 3D Kinematics Descriptor for Low-Latency Action Recognition and Detection
Human action recognition under low observational latency is receiving a growing interest in computer vision due to rapidly developing technologies in human-robot interaction, computer gaming and…
  • 353
  • 38
Matrix Backpropagation for Deep Networks with Structured Layers
Deep neural network architectures have recently produced excellent results in a variety of areas in artificial intelligence and visual recognition, well surpassing traditional shallow architectures…
  • 143
  • 34
Efficient Match Kernel between Sets of Features for Visual Recognition
In visual recognition, the images are frequently modeled as unordered collections of local features (bags). We show that bag-of-words representations commonly used in conjunction with linear…
  • 245
  • 32
Actions in the Eye: Dynamic Gaze Datasets and Learnt Saliency Models for Visual Recognition
Systems based on bag-of-words models from image features collected at maxima of sparse interest point operators have been used successfully for both computer visual object and action recognition…
  • 123
  • 31
Discriminative density propagation for 3D human motion estimation
We describe a mixture density propagation algorithm to estimate 3D human motion in monocular video sequences based on observations encoding the appearance of image silhouettes. Our approach is…
  • 277
  • 27