Context-Aware Human Motion Prediction

Enric Corona, Albert Pumarola, G. Alenyà, Francesc Moreno-Noguer
2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
The problem of predicting human motion given a sequence of past observations is at the core of many applications in robotics and computer vision. […] These interactions are iteratively learned through a graph attention layer fed with the past observations, which now include both object and human body motions. Once this semantic graph is learned, it is injected into a standard RNN to predict the future movements of the human(s) and object(s).
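The core operation described above, a graph attention layer that lets every node (human joint or object) aggregate information from all others, can be sketched in a few lines. This is a minimal pure-Python illustration of single-head attention-weighted aggregation; the scoring function, weight vector, and function names are illustrative assumptions, not the paper's actual architecture.

```python
import math

def softmax(xs):
    m = max(xs)
    e = [math.exp(x - m) for x in xs]
    s = sum(e)
    return [v / s for v in e]

def attention_step(nodes, w_score):
    """One graph-attention aggregation: every node attends to all nodes.

    nodes:   list of feature vectors (human and object nodes alike).
    w_score: scoring weights, one per feature dimension (a stand-in for
             the learned attention parameters).
    Returns the attention-weighted context vector for each node.
    """
    def score(q, k):
        # additive score on the (query, key) pair, reduced to a scalar
        return sum(w * (x + y) for w, x, y in zip(w_score, q, k))

    out = []
    for q in nodes:
        alphas = softmax([score(q, k) for k in nodes])  # attention weights
        ctx = [sum(a * k[d] for a, k in zip(alphas, nodes))
               for d in range(len(q))]                  # weighted sum of values
        out.append(ctx)
    return out
```

In the full model, the resulting context vectors would be concatenated with the node features and fed to the RNN at each prediction step.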


Motion Prediction using Trajectory Cues

A semi-constrained graph is presented to explicitly encode skeletal connections and prior knowledge, while adaptively learning implicit dependencies between joints to model motion contexts in the joint trajectory space.

Multi-Person Extreme Motion Prediction

A novel cross-interaction attention mechanism that exploits the historical information of both persons and learns cross dependencies between the two pose sequences, in order to predict the future motion of two interacting persons given their past skeletons.

Multi-Person Extreme Motion Prediction with Cross-Interaction Attention

This paper captures ExPI (Extreme Pose Interaction), a new lab-based person-interaction dataset of professional dancers performing acrobatics, and devises a novel cross-interaction attention mechanism that exploits the historical information of both persons and learns cross dependencies between self poses and the poses of the other person, regardless of their spatial or temporal distance.

Dyadic Human Motion Prediction

This paper introduces a motion prediction framework that explicitly reasons about the interactions of two observed subjects, using a pairwise attention mechanism that models the mutual dependencies in their motion histories.

Action-guided 3D Human Motion Prediction

An action-specific memory bank is constructed to store representative motion dynamics for each action category, and a query-read process is designed to retrieve some motion dynamics from the memory bank to guide motion prediction.
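The query-read process described above amounts to nearest-neighbor retrieval from a per-action store of keyed motion dynamics. Below is a minimal pure-Python sketch under that assumption; the bank layout, similarity measure, and function names are illustrative, not the paper's actual design.

```python
import math

def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def read_memory(bank, action, query, top_k=1):
    """Retrieve the stored motion dynamics most similar to the query.

    bank:   dict mapping action label -> list of (key, value) pairs,
            where keys are feature vectors and values are the stored
            representative motion dynamics.
    query:  feature vector summarizing the observed motion.
    """
    entries = bank[action]
    ranked = sorted(entries, key=lambda kv: cosine(query, kv[0]),
                    reverse=True)
    return [value for _, value in ranked[:top_k]]
```

The retrieved dynamics would then condition the motion predictor, steering it toward action-typical future poses.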

AGVNet: Attention Guided Velocity Learning for 3D Human Motion Prediction

A novel attention-guided velocity learning network, AGVNet, that utilizes multi-order information such as positions and velocities derived from the dynamic states of the human body for predicting human motion.

TRiPOD: Human Trajectory and Pose Dynamics Forecasting in the Wild

A novel TRajectory and POse Dynamics (nicknamed TRiPOD) method based on graph attentional networks to model the human-human and human-object interactions both in the input space and the output space (decoded future output).

Stochastic Scene-Aware Motion Prediction

This work presents a novel data-driven, stochastic motion synthesis method that models different styles of performing a given action with a target object and generalizes to target objects of various geometries while enabling the character to navigate in cluttered scenes.

Towards Accurate 3D Human Motion Prediction from Incomplete Observations

This work proposes a novel multi-task graph convolutional network (MTGCN), in which the primary task is to forecast future 3D human actions accurately, while the auxiliary one is to repair the missing values of the incomplete observations.

3D Human Motion Prediction: A Survey

A comprehensive survey on 3D human motion prediction that reviews and analyzes relevant works from the existing literature, and constructs a pertinent taxonomy to categorize existing approaches to 3D human motion prediction.

On Human Motion Prediction Using Recurrent Neural Networks

It is shown that, surprisingly, state-of-the-art performance can be achieved by a simple baseline that does not attempt to model motion at all, and a simple and scalable RNN architecture is proposed that obtains state-of-the-art performance on human motion prediction.

Learning Trajectory Dependencies for Human Motion Prediction

A simple feed-forward deep network for motion prediction that accounts for both temporal smoothness and spatial dependencies among human body joints, with a new graph convolutional network designed to learn graph connectivity automatically.
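A graph convolutional layer with learned connectivity reduces to H' = A · H · W, where the adjacency A is a free parameter trained end to end rather than fixed to the skeleton. The following is a minimal pure-Python sketch of one such layer; the shapes and names are illustrative assumptions.

```python
def matmul(a, b):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def gcn_layer(adj, feats, weight):
    """One graph-convolution layer: H' = A @ H @ W.

    adj:    learned (joints x joints) adjacency, not restricted to bones.
    feats:  (joints x features) node feature matrix H.
    weight: (features x features) learned projection W.
    """
    return matmul(matmul(adj, feats), weight)
```

During training, gradients flow into `adj` just as into `weight`, so the network can discover joint dependencies (e.g. between the two arms) that the kinematic tree does not encode.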

Structured Prediction Helps 3D Human Motion Modelling

A novel approach that decomposes the prediction into individual joints by means of a structured prediction layer that explicitly models the joint dependencies and increases the performance of motion forecasting irrespective of the base network, joint-angle representation, and prediction horizon.

Predicting 3D Human Dynamics From Video

This work presents perhaps the first approach for predicting a future 3D mesh model sequence of a person from past video input, and inspired by the success of autoregressive models in language modeling tasks, learns an intermediate latent space on which to predict the future.

BiHMP-GAN: Bidirectional 3D Human Motion Prediction GAN

Outcomes of both qualitative and quantitative evaluations, on the probabilistic generations of the model, demonstrate the superiority of BiHMP-GAN over previously available methods.

Human Motion Prediction via Spatio-Temporal Inpainting

This work argues that the L2 metric, considered so far by most approaches, fails to capture the actual distribution of long-term human motion, and proposes two alternative metrics, based on the distribution of frequencies, that are able to capture more realistic motion patterns.
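A frequency-based metric of the kind argued for above can be built by comparing the power spectra of predicted and ground-truth joint trajectories instead of their pointwise L2 error. This is a minimal pure-Python sketch (naive DFT, L1 distance between normalized spectra); the exact metrics proposed by the paper differ, and all names here are illustrative.

```python
import cmath

def power_spectrum(signal):
    """Naive DFT power spectrum of one joint trajectory."""
    n = len(signal)
    spec = []
    for k in range(n):
        s = sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, x in enumerate(signal))
        spec.append(abs(s) ** 2)
    return spec

def spectral_l1(pred, gt):
    """L1 distance between the normalized power spectra of two sequences.

    Zero when the two motions share the same frequency distribution,
    even if they are not aligned frame by frame.
    """
    ps, gs = power_spectrum(pred), power_spectrum(gt)
    sp = sum(ps) or 1.0
    sg = sum(gs) or 1.0
    return sum(abs(p / sp - g / sg) for p, g in zip(ps, gs))
```

Unlike L2 on poses, such a metric does not collapse to favoring the motionless mean pose, because a static prediction concentrates all its energy at frequency zero.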

Learning Human Motion Models for Long-Term Predictions

The Dropout Autoencoder LSTM (DAE-LSTM), a new architecture for learning predictive spatio-temporal motion models from data alone, is capable of synthesizing natural-looking motion sequences over long time horizons without catastrophic drift or motion degradation.

Anticipating Human Activities Using Object Affordances for Reactive Robotic Response

This work represents each possible future with an anticipatory temporal conditional random field (ATCRF) that models rich spatio-temporal relations through object affordances; each ATCRF is treated as a particle, and the distribution over potential futures is represented by a set of particles.

Recurrent Network Models for Human Dynamics

The Encoder-Recurrent-Decoder (ERD) model is a recurrent neural network that incorporates nonlinear encoder and decoder networks before and after the recurrent layers, extending previous Long Short-Term Memory models in the literature to jointly learn representations and their dynamics.

Adversarial Geometry-Aware Human Motion Prediction

This work proposes a novel frame-wise geodesic loss as a geometrically meaningful, more precise distance measurement and presents a new learning procedure to simultaneously validate the sequence-level plausibility of the prediction and its coherence with the input sequence by introducing two global recurrent discriminators.