AdaFuse: Adaptive Multiview Fusion for Accurate Human Pose Estimation in the Wild

  • Zhe Zhang, Chunyu Wang, Weichao Qiu, Wenhu Qin, Wenjun Zeng
  • Int. J. Comput. Vis.
Occlusion is probably the biggest challenge for human pose estimation in the wild. Typical solutions often rely on intrusive sensors such as IMUs to detect occluded joints. To make the task truly unconstrained, we present AdaFuse, an adaptive multiview fusion method, which can enhance the features in occluded views by leveraging those in visible views. The core of AdaFuse is to determine the point-point correspondence between two views which we solve effectively by exploring the sparsity of the…
Deep Learning-Based Human Pose Estimation: A Survey
A comprehensive survey of deep learning-based human pose estimation methods that analyzes the methodologies employed and summarizes and discusses recent works using a methodology-based taxonomy.
Adaptively Multi-view and Temporal Fusing Transformer for 3D Human Pose Estimation
Compared with state-of-the-art methods with camera parameters, experiments show that MTF-Transformer not only obtains comparable results but also generalizes well to dynamic capture with an arbitrary number of unseen views.
Semantically Synchronizing Multiple-Camera Systems with Human Pose Estimation
This work proposes an out-of-the-box framework to temporally synchronize multiple cameras using semantic human pose estimation from the videos to derive the optimal temporal displacement configuration for the multiple-camera system.
Semi-supervised Dense Keypoints Using Unlabeled Multiview Images
This paper presents a new end-to-end semi-supervised framework to learn a dense keypoint detector using unlabeled multiview images, and designs a new neural network architecture that effectively minimizes the probabilistic epipolar errors of all possible correspondences between two view images by building affinity matrices.
Delve into balanced and accurate approaches for ship detection in aerial images
  • Boyong He, Bo Huang, Yue Shen, Liaoni Wu
  • Computer Science
  • Neural Computing and Applications
  • 2021
This paper uses a virtual 3D engine to create scenes with ship objects and automatically annotate the collected images with bounding boxes, generating a synthetic ship detection dataset called unreal-ship, and designs an efficient anchor generation structure, Guided Anchor, that uses semantic information to guide the generation of high-quality anchors.
TransFusion: Cross-view Fusion with Transformer for 3D Human Pose Estimation
  • Haoyu Ma, Liangjian Chen, +7 authors Xiaohui Xie
  • Computer Science
  • ArXiv
  • 2021
A transformer framework for multi-view 3D pose estimation is introduced, aiming to directly improve individual 2D predictors by integrating information from different views, and the concept of an epipolar field is proposed to encode 3D positional information into the transformer model.


MetaFuse: A Pre-trained Fusion Model for Human Pose Estimation
MetaFuse is introduced, a pre-trained fusion model learned from a large number of cameras in the Panoptic dataset that can be efficiently adapted or finetuned for a new pair of cameras using a small number of labeled images.
Learning Monocular 3D Human Pose Estimation from Multi-view Images
This paper trains the system to predict the same pose in all views, and proposes a method to estimate camera pose jointly with human pose, which lets us utilize multiview footage where calibration is difficult, e.g., for pan-tilt or moving handheld cameras.
3D Human Pose Estimation in the Wild by Adversarial Learning
An adversarial learning framework is proposed that distills the 3D human pose structures learned from the fully annotated dataset to in-the-wild images with only 2D pose annotations, together with a geometric descriptor that computes the pairwise relative locations and distances between body joints as a new information source for the discriminator.
Ordinal Depth Supervision for 3D Human Pose Estimation
This work proposes to use a weaker supervision signal provided by the ordinal depths of human joints, which achieves new state-of-the-art performance on the relevant benchmarks and validates the effectiveness of ordinal depth supervision for 3D human pose.
Towards 3D Human Pose Estimation in the Wild: A Weakly-Supervised Approach
A weakly-supervised transfer learning method that uses mixed 2D and 3D labels in a unified deep neural network that presents a two-stage cascaded structure to regularize the 3D pose prediction, which is effective in the absence of ground truth depth labels.
PifPaf: Composite Fields for Human Pose Estimation
The new PifPaf method, which uses a Part Intensity Field to localize body parts and a Part Association Field to associate body parts with each other to form full human poses, outperforms previous methods at low resolution and in crowded, cluttered and occluded scenes.
Fusing Visual and Inertial Sensors with Semantics for 3D Human Pose Estimation
A multi-channel 3D convolutional neural network is used to learn a pose embedding from visual occupancy and semantic 2D pose estimates from the MVV in a discretised volumetric probabilistic visual hull, yielding improved accuracy over prior methods.
2D Human Pose Estimation: New Benchmark and State of the Art Analysis
A novel benchmark "MPII Human Pose" is introduced that makes a significant advance in terms of diversity and difficulty, a contribution that is required for future developments in human body models.
Total Capture: 3D Human Pose Estimation Fusing Video and Inertial Sensors
An algorithm for fusing multi-viewpoint video (MVV) with inertial measurement unit (IMU) sensor data to accurately estimate 3D human pose is presented, yielding improved accuracy over prior methods.
Occlusion-Aware Networks for 3D Human Pose Estimation in Video
This work introduces an occlusion-aware deep-learning framework that outperforms state-of-the-art methods on Human 3.6M and HumanEva-I datasets and introduces a "Cylinder Man Model" to approximate the occupation of body parts in 3D space.