Corpus ID: 226965638

Recovering and Simulating Pedestrians in the Wild

Ze Yang, Sivabalan Manivasagam, Ming Liang, Binh Yang, Wei-Chiu Ma, Raquel Urtasun
Sensor simulation is a key component for testing the performance of self-driving vehicles and for data augmentation to better train perception systems. Typical approaches rely on artists to create both 3D assets and their animations to generate a new scenario. This, however, does not scale. In contrast, we propose to recover the shape and motion of pedestrians from sensor readings captured in the wild by a self-driving car driving around. Towards this goal, we formulate the problem as energy…

Towards Optimal Strategies for Training Self-Driving Perception Models in Simulation
This work focuses on the use of labels in the synthetic domain alone, and introduces both a principled way to learn neural-invariant representations and a theoretically inspired view on how to sample the data from the simulator.
S3: Neural Shape, Skeleton, and Skinning Fields for 3D Human Modeling
This work represents the pedestrian’s shape, pose and skinning weights as neural implicit functions that are directly learned from data, allowing it to handle a wide variety of different pedestrian shapes and poses without explicitly fitting a human parametric body model.
Physics-based Human Motion Estimation and Synthesis from Videos
This work proposes a framework for training generative models of physically plausible human motion directly from monocular RGB videos, which are much more widely available, and achieves significantly improved motion estimation, synthesis quality, and physical plausibility, both qualitatively and quantitatively.
Explainability of vision-based autonomous driving systems: Review and challenges
This survey reviews explainability methods for vision-based self-driving systems and discusses definitions, context, and motivation for gaining more interpretability and explainability from self-driving systems.


LiDARsim: Realistic LiDAR Simulation by Leveraging the Real World
This work develops a novel simulator that captures the power of both physics-based and learning-based simulation, and showcases LiDARsim's usefulness for testing perception algorithms on long-tail events and for end-to-end closed-loop evaluation on safety-critical scenarios.
Recovering Accurate 3D Human Pose in the Wild Using IMUs and a Moving Camera
This work proposes a method that combines a single hand-held camera and a set of Inertial Measurement Units (IMUs) attached to the body limbs to estimate accurate 3D poses in the wild, and obtains an accuracy of 26 mm, which makes it accurate enough to serve as a benchmark for image-based 3D pose estimation in the wild.
Exploiting Temporal Context for 3D Human Pose Estimation in the Wild
This work presents a bundle-adjustment-based algorithm for recovering accurate 3D human pose and meshes from monocular videos, and shows, by evaluating on the 3DPW and HumanEVA datasets, that retraining a single-frame 3D pose estimator on this data improves accuracy on both real-world and mocap data.
Detailed Human Shape and Pose from Images
This work represents the body using a recently proposed triangulated mesh model called SCAPE, which employs a low-dimensional but detailed parametric model of shape and pose-dependent deformations that is learned from a database of range scans of human bodies.
Markerless Outdoor Human Motion Capture Using Multiple Autonomous Micro Aerial Vehicles
This work describes the first fully autonomous outdoor capture system based on flying vehicles, which combines multiple state-of-the-art 2D joint detectors with a 3D human body model and a powerful prior on human pose to robustly fit the 2D measurements.
VIBE: Video Inference for Human Body Pose and Shape Estimation
This work defines a novel temporal network architecture with a self-attention mechanism and shows that adversarial training, at the sequence level, produces kinematically plausible motion sequences without in-the-wild ground-truth 3D labels.
End-to-End Recovery of Human Shape and Pose
This work introduces an adversary trained to tell whether human body shape and pose parameters are real or not using a large database of 3D human meshes, and produces a richer and more useful mesh representation that is parameterized by shape and 3D joint angles.
Learning to Reconstruct 3D Human Pose and Shape via Model-Fitting in the Loop
The core of the proposed approach, SPIN (SMPL oPtimization IN the loop), is that the two paradigms can form a strong collaboration: better network estimates can lead the optimization to better solutions, while more accurate optimization fits provide better supervision for the network.
PIXOR: Real-time 3D Object Detection from Point Clouds
This work proposes PIXOR, a proposal-free, single-stage detector that outputs oriented 3D object estimates decoded from pixel-wise neural network predictions. It notably surpasses other state-of-the-art methods in terms of Average Precision (AP) while still running at 10 FPS.
Human3.6M: Large Scale Datasets and Predictive Methods for 3D Human Sensing in Natural Environments
We introduce a new dataset, Human3.6M, of 3.6 million accurate 3D human poses, acquired by recording the performance of 5 female and 6 male subjects, under 4 different viewpoints, for training…