Social-STAGE: Spatio-Temporal Multi-Modal Future Trajectory Forecast

@article{Malla2020SocialSTAGESM,
  title={Social-STAGE: Spatio-Temporal Multi-Modal Future Trajectory Forecast},
  author={Srikanth Malla and Chiho Choi and Behzad Dariush},
  journal={2021 IEEE International Conference on Robotics and Automation (ICRA)},
  year={2020},
  pages={13938-13944}
}
This paper considers the problem of multi-modal future trajectory forecast with ranking. Here, multi-modality and ranking refer to the multiple plausible path predictions and the confidence in those predictions, respectively. We propose Social-STAGE, Social interaction-aware Spatio-Temporal multi-Attention Graph convolution network with novel Evaluation for multi-modality. Our main contributions include analysis and formulation of multi-modality with ranking using interaction and multi… 

Skeleton-Graph: Long-Term 3D Motion Prediction From 2D Observations Using Deep Spatio-Temporal Graph CNNs

Skeleton-Graph is a deep spatio-temporal graph CNN model that predicts future 3D skeleton poses in a single pass from 2D observations; the work also introduces a new metric that measures the divergence of predictions over the long term.

References

STGAT: Modeling Spatial-Temporal Interactions for Human Trajectory Prediction

This work proposes a Spatial-Temporal Graph Attention network (STGAT), based on a sequence-to-sequence architecture to predict future trajectories of pedestrians, which achieves superior performance on two publicly available crowd datasets and produces more "socially" plausible trajectories for pedestrians.

Social-STGCNN: A Social Spatio-Temporal Graph Convolutional Neural Network for Human Trajectory Prediction

The Social Spatio-Temporal Graph Convolutional Neural Network (Social-STGCNN) replaces aggregation methods by modeling the interactions as a graph; it is data efficient and exceeds the previous state of the art on the ADE metric with only 20% of the training data.

Social-BiGAT: Multimodal Trajectory Forecasting using Bicycle-GAN and Graph Attention Networks

A graph-based generative adversarial network that generates realistic, multimodal trajectory predictions by better modelling the social interactions of pedestrians in a scene, and achieves state-of-the-art performance compared with several baselines on existing trajectory forecasting benchmarks.

SR-LSTM: State Refinement for LSTM Towards Pedestrian Trajectory Prediction

This work proposes a data-driven state refinement module for LSTM networks (SR-LSTM), which makes use of the current intention of neighbors and jointly and iteratively refines the current states of all participants in the crowd through a message passing mechanism.

Social LSTM: Human Trajectory Prediction in Crowded Spaces

This work proposes an LSTM model that learns general human movement and predicts future trajectories, outperforming state-of-the-art methods on some of the datasets evaluated.

Shared Cross-Modal Trajectory Prediction for Autonomous Driving

  • Chiho Choi
  • Computer Science
    2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
  • 2021
A Cross-Modal Embedding framework that aims to benefit from the use of multiple input modalities and learns to embed a set of complementary features in a shared latent space by jointly optimizing the objective functions across different types of input data.

The Trajectron: Probabilistic Multi-Agent Trajectory Modeling With Dynamic Spatiotemporal Graphs

  • B. Ivanovic, M. Pavone
  • Computer Science
    2019 IEEE/CVF International Conference on Computer Vision (ICCV)
  • 2019
The Trajectron is presented, a graph-structured model that predicts many potential future trajectories of multiple agents simultaneously in both highly dynamic and multimodal scenarios (i.e. where the number of agents in the scene is time-varying and there are many possible highly-distinct futures for each agent).

DESIRE: Distant Future Prediction in Dynamic Scenes with Interacting Agents

The proposed Deep Stochastic IOC RNN Encoder-decoder framework, DESIRE, for the task of future predictions of multiple interacting agents in dynamic scenes significantly improves the prediction accuracy compared to other baseline methods.

TrafficPredict: Trajectory Prediction for Heterogeneous Traffic-Agents

The proposed long short-term memory-based (LSTM-based) real-time traffic prediction algorithm, TrafficPredict, uses an instance layer to learn instances' movements and interactions and a category layer to learn the similarities of instances belonging to the same type to refine the prediction.

Overcoming Limitations of Mixture Density Networks: A Sampling and Fitting Framework for Multimodal Future Prediction

This work presents an approach that involves the prediction of several samples of the future with a winner-takes-all loss and iterative grouping of samples to multiple modes and shows on synthetic and real data that the proposed approach triggers good estimates of multimodal distributions and avoids mode collapse.
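
The winner-takes-all idea mentioned in the last summary can be illustrated with a minimal sketch. It assumes trajectories are lists of (x, y) points and uses the mean point-wise Euclidean distance as the error. The function names are hypothetical, and this is not the cited paper's implementation, which also involves an iterative grouping of samples into modes that is not shown here.

```python
import math


def trajectory_error(pred, truth):
    """Mean Euclidean distance between two equal-length lists of (x, y) points."""
    if len(pred) != len(truth) or not truth:
        raise ValueError("trajectories must be non-empty and of equal length")
    return sum(math.dist(p, t) for p, t in zip(pred, truth)) / len(truth)


def winner_takes_all_loss(hypotheses, truth):
    """Return (loss, index) of the hypothesis closest to the ground truth.

    Only the best hypothesis contributes to the loss, so the remaining
    hypotheses are free to cover other plausible futures.
    """
    if not hypotheses:
        raise ValueError("at least one hypothesis is required")
    errors = [trajectory_error(h, truth) for h in hypotheses]
    best = min(range(len(errors)), key=errors.__getitem__)
    return errors[best], best


# Two hypothetical predicted futures: one heading right, one heading up.
hypotheses = [
    [(0.0, 0.0), (1.0, 0.0)],
    [(0.0, 0.0), (0.0, 1.0)],
]
truth = [(0.0, 0.0), (0.0, 1.0)]
loss, winner = winner_takes_all_loss(hypotheses, truth)
```

In this toy case the second hypothesis matches the ground truth exactly, so it is selected as the winner and the loss is zero; the first hypothesis is not penalized.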