Perceiving Humans: from Monocular 3D Localization to Social Distancing

@article{Bertoni2021PerceivingHF,
  title={Perceiving Humans: from Monocular 3D Localization to Social Distancing},
  author={Lorenzo Bertoni and Sven Kreiss and Alexandre Alahi},
  journal={ArXiv},
  year={2021},
  volume={abs/2009.00984}
}
Perceiving humans in the context of Intelligent Transportation Systems (ITS) often relies on multiple cameras or expensive LiDAR sensors. In this work, we present a new cost-effective vision-based method that perceives humans' locations in 3D and their body orientation from a single image. We address the challenges of this ill-posed monocular 3D task by proposing a deep learning method that predicts confidence intervals rather than point estimates. Our neural network architecture… 
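As a rough illustration of what "confidence intervals rather than point estimates" can look like for monocular distance regression, below is a minimal PyTorch sketch, not the authors' code: the module and function names (UncertainDistanceHead, laplace_nll) are hypothetical, and the Laplace negative log-likelihood loss is an assumption modelled on the MonoLoco baseline listed in the references.

import torch
import torch.nn as nn

class UncertainDistanceHead(nn.Module):
    """Hypothetical head: flattened 2D keypoints -> distance mean and spread."""
    def __init__(self, n_keypoints: int = 17, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * n_keypoints, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 2),  # outputs: distance mu and log-spread log_b
        )

    def forward(self, keypoints: torch.Tensor):
        # keypoints: (batch, n_keypoints, 2) pixel coordinates of a detected 2D pose
        out = self.net(keypoints.flatten(1))
        return out[:, 0], out[:, 1]  # mu, log_b

def laplace_nll(mu, log_b, target):
    # Laplace negative log-likelihood: the spread b grows on ambiguous examples,
    # so [mu - b, mu + b] behaves like a per-person confidence interval.
    b = log_b.exp()
    return (torch.abs(target - mu) / b + log_b).mean()

# Toy usage with random poses and ground-truth distances in metres.
model = UncertainDistanceHead()
poses = torch.rand(8, 17, 2)
gt_dist = torch.rand(8) * 30
mu, log_b = model(poses)
laplace_nll(mu, log_b, gt_dist).backward()

Reading the predicted spread as an interval, rather than discarding it, is what lets downstream applications such as social distancing reason about how reliable each localization is.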
Feedback-efficient Active Preference Learning for Socially Aware Robot Navigation
TLDR
This paper proposes a feedback-efficient active preference learning approach, FAPL, that distills human comfort and expectation into a reward model to guide the robot agent to explore latent aspects of social compliance.
Non-Pharmaceutical Interventions against COVID-19 Pandemic: Review of Contact Tracing and Social Distancing Technologies, Protocols, Apps, Security and Open Research Directions
TLDR
The paper critically and comprehensively reviews contact tracing technologies, protocols, and mobile applications recently developed and deployed against the coronavirus disease, and highlights security/privacy vulnerabilities identified in contact tracing and social distancing technologies along with solutions against these vulnerabilities.
Automatic Social Distance Estimation From Images: Performance Evaluation, Test Benchmark, and Algorithm
TLDR
This paper provides a dataset of images with measured pairwise social distances under different camera positions and focal-length values, and proposes a method for automatic social distance estimation that takes advantage of object detection and human pose estimation.
Keypoint Communities
TLDR
A fast bottom-up method that jointly detects over 100 keypoints on humans or objects, also referred to as human/object pose estimation, and uses a graph centrality measure to assign training weights to different parts of a pose.
Monocular Pedestrian 3D Localization for Social Distance Monitoring
TLDR
An innovative pedestrian 3D localization method that combines monocular images with terrestrial point clouds; it relies on simple and efficient calculations, obtains accurate locations, and can be used to implement social distancing rules (a minimal distance-check sketch follows below).
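Once per-person 3D locations are available from any of the monocular methods above, the social-distance check itself is a small computation. The sketch below is a hypothetical illustration; the 2 m threshold and the function name flag_close_pairs are assumptions, not code from the cited papers.

from itertools import combinations
import math

def flag_close_pairs(positions, threshold_m=2.0):
    """positions: list of (x, y, z) camera-frame coordinates in metres."""
    violations = []
    for (i, p), (j, q) in combinations(enumerate(positions), 2):
        dist = math.dist(p, q)  # Euclidean distance between two localized people
        if dist < threshold_m:
            violations.append((i, j, dist))
    return violations

# Example: the first two people are roughly 0.76 m apart and get flagged.
print(flag_close_pairs([(0.5, 0.0, 4.0), (1.2, 0.0, 4.3), (5.0, 0.0, 10.0)]))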

References

Showing 1-10 of 123 references
MonoLoco: Monocular 3D Pedestrian Localization and Uncertainty Estimation
TLDR
The architecture is a lightweight feed-forward neural network that predicts 3D locations and corresponding confidence intervals from 2D human poses; it is particularly well suited for small training data, cross-dataset generalization, and real-time applications.
MonoGRNet: A Geometric Reasoning Network for Monocular 3D Object Localization
TLDR
This work proposes a novel instance depth estimation (IDE) method that directly predicts the depth of the targeted 3D bounding box's center using sparse supervision, and demonstrates that MonoGRNet achieves state-of-the-art performance on challenging datasets.
Towards social interaction detection in egocentric photo-streams
TLDR
This work detects the presence or absence of interaction with the camera wearer and identifies which people are most involved in the interaction, by estimating pair-to-pair interaction probabilities over the sequence.
Deep Network for the Integrated 3D Sensing of Multiple People in Natural Images
TLDR
A multi-task deep neural network with differentiable stages is proposed, in which the person grouping problem is formulated as an integer program based on learned body-part scores parameterized by both 2D and 3D information.
Recognizing proxemics in personal photos
TLDR
This work presents a computational formulation of visual proxemics, labeling each pair of people in an image with a subset of physically based "touch codes" using an articulated model tuned for each touch code.
PifPaf: Composite Fields for Human Pose Estimation
TLDR
The new PifPaf method, which uses a Part Intensity Field to localize body parts and a Part Association Field to associate body parts with each other to form full human poses, outperforms previous methods at low resolution and in crowded, cluttered and occluded scenes.
A Simple Yet Effective Baseline for 3d Human Pose Estimation
TLDR
The results indicate that a large portion of the error of modern deep 3d pose estimation systems stems from their visual analysis, and suggests directions to further advance the state of the art in 3d human pose estimation.
3D Human Pose Estimation from a Single Image via Distance Matrix Regression
  F. Moreno-Noguer. 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
TLDR
It is shown that more precise pose estimates can be obtained by representing both the 2D and 3D human poses using N×N distance matrices, and formulating the problem as a 2D-to-3D distance matrix regression.
Monocular 3D Object Detection Leveraging Accurate Proposals and Shape Reconstruction
TLDR
MonoPSR, a monocular 3D object detection method that leverages proposals and shape reconstruction, is presented, and a novel projection alignment loss is devised to jointly optimize these tasks in the neural network to improve 3D localization accuracy.
Social Relation Recognition in Egocentric Photostreams
TLDR
This paper proposes an approach to automatically categorize the social interactions of a user wearing a photo-camera, relying solely on what the camera sees; it exploits the hierarchical structure of the label space and a set of social attributes estimated at frame level to provide a semantic representation of social interactions.