Benchmarking and Error Diagnosis in Multi-instance Pose Estimation

  • Matteo Ruggero Ronchi, Pietro Perona
  • Published 17 July 2017
  • Computer Science
  • 2017 IEEE International Conference on Computer Vision (ICCV)
We propose a new method to analyze the impact of errors in algorithms for multi-instance pose estimation and a principled benchmark that can be used to compare them. We define and characterize three classes of errors (localization, scoring, and background), and study how they are influenced by instance attributes and how they affect an algorithm’s performance. Our technique is applied to compare the two leading methods for human pose estimation on the COCO Dataset, measure the sensitivity of pose…
PoseFix: Model-Agnostic General Human Pose Refinement Network
This paper proposes a human pose refinement network that estimates a refined pose from a tuple of an input image and input pose and shows that the proposed approach achieves better performance than the conventional multi-stage refinement models and consistently improves the performance of various state-of-the-art pose estimation methods on the commonly used benchmark.
Multi-Instance Pose Networks: Rethinking Top-Down Pose Estimation
This work introduces a Multi-Instance Modulation Block (MIMB) that can adaptively modulate channel-wise feature responses for each instance and is parameter efficient and demonstrates the efficacy of the approach by evaluating on COCO, CrowdPose, and OCHuman datasets.
Learning to Acquire the Quality of Human Pose Estimation
End-to-end human pose quality learning is proposed, which adds a quality prediction block alongside pose regression that learns the object keypoint similarity (OKS) between the estimated pose and its corresponding ground truth by sharing the pose features with heatmap regression.
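For readers unfamiliar with the object keypoint similarity mentioned above, the following is a minimal sketch of the measure as it is commonly defined for the COCO keypoints task: a Gaussian of the keypoint distance, normalized by object area and a per-keypoint constant, averaged over labeled keypoints. The function name `oks` and the sigma values in the usage example are illustrative assumptions, not taken from the paper, and the sketch assumes a positive `area`.

```python
import math

def oks(pred, gt, visibility, area, sigmas):
    """Object keypoint similarity between one predicted and one ground-truth pose.

    pred, gt   : lists of (x, y) keypoint coordinates in pixels
    visibility : list of ints, > 0 where the ground-truth keypoint is labeled
    area       : ground-truth object area in pixels^2 (assumed > 0)
    sigmas     : per-keypoint falloff constants (illustrative values here)
    """
    total = 0.0
    labeled = 0
    for (px, py), (gx, gy), v, sigma in zip(pred, gt, visibility, sigmas):
        if v <= 0:
            continue  # unlabeled keypoints do not contribute
        d2 = (px - gx) ** 2 + (py - gy) ** 2
        k = 2.0 * sigma
        total += math.exp(-d2 / (2.0 * area * k * k))
        labeled += 1
    return total / labeled if labeled else 0.0

# Usage with made-up numbers: a three-keypoint pose, last keypoint unlabeled.
gt_pose = [(10.0, 10.0), (20.0, 10.0), (15.0, 30.0)]
pred_pose = [(11.0, 10.0), (20.0, 12.0), (90.0, 90.0)]
score = oks(pred_pose, gt_pose, [2, 1, 0], area=400.0, sigmas=[0.05, 0.05, 0.1])
```

A perfect prediction scores 1, and the score decays toward 0 as the labeled keypoints drift from the ground truth relative to the object's size.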
FastPose: Towards Real-time Pose Estimation and Tracking via Scale-normalized Multi-task Networks
This paper addresses the task of articulated multi-person pose estimation and tracking towards real-time speed by adopting an occlusion-aware Re-ID feature strategy in the pose tracking module of an end-to-end multi-task network (MTN).
Precise human pose estimation based on two-dimensional images for kinematic analysis
This work developed a simple yet useful deep learning algorithm for Human Pose Estimation that uses as input only an image of a scene with people, aiming to preserve more precise pixel location.
Exploiting Offset-guided Network for Pose Estimation and Tracking
The Offset-guided Network (OGN) is proposed with an intuitive but effective fusion strategy for both two-stage pose estimation and Mask R-CNN, and a greedy box generation strategy is also proposed to keep more necessary candidates while performing person detection.
VRU Pose-SSD: Multiperson Pose Estimation For Automated Driving
A fast and efficient approach for joint person detection and pose estimation, optimized for automated driving (AD) in urban scenarios, is presented together with a two-stage evaluation strategy that is more suitable for AD; the approach achieves a significant performance improvement in comparison to state-of-the-art approaches.
Removing the Bias of Integral Pose Regression
This paper investigates the difference in supervision between the heatmap-based detection and integral regression and proposes a simple combined detection and bias-compensated regression method that considerably outperforms state-of-the-art baselines with few added components.
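As background for the summary above, integral regression is usually understood as taking the expected coordinate under a softmax-normalized heatmap, in contrast to detection, which takes the location of the maximum. The sketch below illustrates only that generic operation; it is not the paper's bias-compensated method, and the function name `soft_argmax_2d` and the `beta` sharpness parameter are assumptions made for illustration.

```python
import math

def soft_argmax_2d(heatmap, beta=1.0):
    """Expected (x, y) location under a softmax over a 2D heatmap.

    heatmap : list of rows, each a list of floats (row index = y, column index = x)
    beta    : softmax sharpness (hypothetical parameter); larger values
              bring the result closer to the hard argmax
    """
    peak = max(max(row) for row in heatmap)
    # Subtract the peak before exponentiating for numerical stability.
    weights = [[math.exp(beta * (v - peak)) for v in row] for row in heatmap]
    z = sum(sum(row) for row in weights)
    x = sum(w * j for row in weights for j, w in enumerate(row)) / z
    y = sum(w * i for i, row in enumerate(weights) for w in row) / z
    return x, y

# Usage with a made-up heatmap whose peak sits in the top-left corner.
corner = [[5.0, 0.0, 0.0],
          [0.0, 0.0, 0.0],
          [0.0, 0.0, 0.0]]
x_soft, y_soft = soft_argmax_2d(corner)
```

In this example the hard argmax is at (0, 0), while the expected coordinate is pulled slightly toward the interior of the map because every cell receives non-zero softmax weight.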
Using Diverse Neural Networks for Safer Human Pose Estimation: Towards Making Neural Networks Know When They Don’t Know
This work proposes a method to identify and eliminate false detections by comparing keypoint detections from different neural networks and assigning a ’Don’t know’ label in the case of a mismatch, driven by the principle of software diversity.
It's all Relative: Monocular 3D Human Pose Estimation from Weakly Supervised Data
This work proposes a 3D human pose estimation algorithm that only requires relative estimates of depth at training time, which opens the door to using existing widespread 2D datasets for 3D pose estimation by allowing fine-tuning with noisy relative constraints, resulting in more accurate 3D poses.


Towards Accurate Multi-person Pose Estimation in the Wild
This work proposes a method for multi-person detection and 2-D pose estimation that achieves state-of-the-art results on the challenging COCO keypoints task by using a novel form of keypoint-based Non-Maximum-Suppression (NMS), instead of the cruder box-level NMS, and by introducing a novel aggregation procedure to obtain highly localized keypoint predictions.
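To make the contrast with box-level NMS concrete, here is a minimal sketch of greedy suppression driven by keypoint similarity rather than bounding-box overlap. It is an illustration under stated assumptions, not the paper's procedure: the similarity measure (a mean Gaussian of keypoint distances), the `scale` parameter, the threshold, and the function names are all hypothetical.

```python
import math

def pose_similarity(pose_a, pose_b, scale):
    """Mean Gaussian similarity over corresponding keypoints (hypothetical measure)."""
    sims = [
        math.exp(-((ax - bx) ** 2 + (ay - by) ** 2) / (2.0 * scale ** 2))
        for (ax, ay), (bx, by) in zip(pose_a, pose_b)
    ]
    return sum(sims) / len(sims)

def keypoint_nms(detections, scale, threshold=0.5):
    """Greedy NMS over poses.

    detections : list of (score, pose), where pose is a list of (x, y)
    Returns the kept detections in descending score order.
    """
    kept = []
    for score, pose in sorted(detections, key=lambda d: d[0], reverse=True):
        # Keep a pose only if it is dissimilar to every higher-scoring kept pose.
        if all(pose_similarity(pose, kp, scale) < threshold for _, kp in kept):
            kept.append((score, pose))
    return kept

# Usage with made-up detections: the second pose nearly duplicates the first.
dets = [
    (0.9, [(10.0, 10.0), (20.0, 20.0)]),
    (0.8, [(11.0, 10.0), (20.0, 21.0)]),
    (0.7, [(100.0, 100.0), (110.0, 110.0)]),
]
survivors = keypoint_nms(dets, scale=10.0)
```

Comparing poses keypoint by keypoint lets two people with heavily overlapping boxes both survive, as long as their keypoints differ, which a box-level overlap test cannot guarantee.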
Articulated people detection and pose estimation: Reshaping the future
This work proposes a new technique to extend an existing training set that allows explicit control of pose and shape variations and defines a new challenge of combined articulated human detection and pose estimation in real-world scenes.
2D Human Pose Estimation: New Benchmark and State of the Art Analysis
A novel benchmark "MPII Human Pose" is introduced that makes a significant advance in terms of diversity and difficulty, a contribution that is required for future developments in human body models.
Poselets: Body part detectors trained using 3D human pose annotations
A new dataset, H3D, is built of annotations of humans in 2D photographs with 3D joint information, inferred using anthropometric constraints, to address the classic problems of detection, segmentation and pose estimation of people in images with a novel definition of a part, a poselet.
DeeperCut: A Deeper, Stronger, and Faster Multi-person Pose Estimation Model
The goal of this paper is to advance the state-of-the-art of articulated pose estimation in scenes with multiple people. To that end we contribute on three fronts. We propose (1) improved body part
Learning effective human pose estimation from inaccurate annotation
A significant increase in pose estimation accuracy is demonstrated, while simultaneously reducing computational expense by a factor of 10, and a dataset of 10,000 highly articulated poses is contributed.
Realtime Multi-person 2D Pose Estimation Using Part Affinity Fields
We present an approach to efficiently detect the 2D pose of multiple people in an image. The approach uses a nonparametric representation, which we refer to as Part Affinity Fields (PAFs), to learn
DeepCut: Joint Subset Partition and Labeling for Multi Person Pose Estimation
An approach is proposed that jointly solves the tasks of detection and pose estimation: it infers the number of persons in a scene, identifies occluded body parts, and disambiguates body parts between people in close proximity to each other.
Pose Machines: Articulated Pose Estimation via Inference Machines
This paper builds upon the inference machine framework and presents a method for articulated human pose estimation that incorporates rich spatial interactions among multiple parts and information across parts of different scales and outperforms the state-of-the-art on these benchmarks.
Clustered Pose and Nonlinear Appearance Models for Human Pose Estimation
A new annotated database of challenging consumer images is introduced, an order of magnitude larger than currently available datasets, and over 50% relative improvement in pose estimation accuracy over a state-of-the-art method is demonstrated.