Perceptual Losses for Real-Time Style Transfer and Super-Resolution
TLDR
This work considers image transformation problems, proposes the use of perceptual loss functions for training feed-forward networks for image transformation tasks, and shows results on image style transfer, where a feed-forward network is trained to solve the optimization problem proposed by Gatys et al. in real time.
Social LSTM: Human Trajectory Prediction in Crowded Spaces
TLDR
This work proposes an LSTM model that learns general human movement and predicts future trajectories, and it outperforms state-of-the-art methods on some of the evaluated datasets.
FREAK: Fast Retina Keypoint
TLDR
This work proposes a novel keypoint descriptor inspired by the human visual system and more precisely the retina, coined Fast Retina Keypoint (FREAK), which is in general faster to compute with lower memory load and also more robust than SIFT, SURF or BRISK.
Social GAN: Socially Acceptable Trajectories with Generative Adversarial Networks
TLDR
A recurrent sequence-to-sequence model observes motion histories and predicts future behavior, using a novel pooling mechanism to aggregate information across people, and outperforms prior work in terms of accuracy, variety, collision avoidance, and computational complexity.
Learning Social Etiquette: Human Trajectory Understanding In Crowded Scenes
TLDR
This paper contributes a new large-scale dataset that collects videos of various types of targets that navigate in a real world outdoor environment such as a university campus and introduces a new characterization that describes the “social sensitivity” at which two targets interact.
Learning to Track: Online Multi-object Tracking by Decision Making
TLDR
This work formulates the online MOT problem as decision making in Markov Decision Processes (MDPs), where the lifetime of an object is modeled with an MDP, and learning a similarity function for data association is equivalent to learning a policy for the MDP.
Tracking the Untrackable: Learning to Track Multiple Cues with Long-Term Dependencies
TLDR
The robustness of the data-driven approach is demonstrated by tracking multiple targets using their appearance, motion, and even interactions, and it outperforms previous works on multiple publicly available datasets, including the challenging MOT benchmark.
Crowd-Robot Interaction: Crowd-Aware Robot Navigation With Attention-Based Deep Reinforcement Learning
TLDR
This work proposes to rethink pairwise interactions with a self-attention mechanism and to jointly model Human-Robot as well as Human-Human interactions in the deep reinforcement learning framework, capturing the Human-Human interactions occurring in dense crowds that indirectly affect the robot’s anticipation capability.
PifPaf: Composite Fields for Human Pose Estimation
TLDR
The new PifPaf method, which uses a Part Intensity Field to localize body parts and a Part Association Field to associate body parts with each other to form full human poses, outperforms previous methods at low resolution and in crowded, cluttered and occluded scenes.
Towards Viewpoint Invariant 3D Human Pose Estimation
TLDR
A viewpoint invariant model for 3D human pose estimation from a single depth image that leverages a convolutional and recurrent network architecture with a top-down error feedback mechanism to self-correct previous pose estimates in an end-to-end manner.