TITAN: Future Forecast Using Action Priors

Srikanth Malla, Behzad Dariush, Chiho Choi
2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

We consider the problem of predicting the future trajectory of scene agents from egocentric views obtained from a moving platform. This problem is important in a variety of domains, particularly for autonomous systems making reactive or strategic decisions in navigation. In an attempt to address this problem, we introduce TITAN (Trajectory Inference using Targeted Action priors Network), a new model that incorporates prior positions, actions, and context to forecast future trajectory of agents…


Shared Cross-Modal Trajectory Prediction for Autonomous Driving

  • Chiho Choi
  • 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
A Cross-Modal Embedding framework that aims to benefit from the use of multiple input modalities and learns to embed a set of complementary features in a shared latent space by jointly optimizing the objective functions across different types of input data.

SSP: Single Shot Future Trajectory Prediction

A robust solution to future trajectory forecasting that is practically applicable to autonomous agents in highly crowded environments, validated on the ETH, UCY, and SDD datasets, with its practical functionality highlighted in comparison to current state-of-the-art methods.

Multiple Trajectory Prediction of Moving Agents with Memory Augmented Networks

This paper proposes MANTRA, a model that exploits memory augmented networks to effectively predict multiple trajectories of other agents, observed from an egocentric perspective, and shows how once trained the system can continuously improve by ingesting novel patterns.

PePScenes: A Novel Dataset and Baseline for Pedestrian Action Prediction in 3D

A new pedestrian action prediction dataset is proposed, created by adding per-frame 2D/3D bounding box and behavioral annotations to the popular autonomous driving dataset nuScenes, along with a hybrid neural network architecture that incorporates various data modalities for predicting pedestrian crossing action.

EvolveGraph: Multi-Agent Trajectory Prediction with Dynamic Relational Reasoning

This paper proposes a generic trajectory forecasting framework with explicit relational structure recognition and prediction via latent interaction graphs among multiple heterogeneous, interactive agents and introduces a double-stage training pipeline which not only improves training efficiency and accelerates convergence, but also enhances model performance.

DROGON: A Trajectory Prediction Model based on Intention-Conditioned Behavior Reasoning

The proposed framework for accurate vehicle trajectory prediction by considering behavioral intentions of vehicles in traffic scenes is extended to the pedestrian trajectory prediction task, showing the potential applicability toward general trajectory prediction.

Semantic Synthesis of Pedestrian Locomotion

This work reformulates pedestrian trajectory forecasting as a structured reinforcement learning (RL) problem, and proposes a hierarchical model consisting of a semantic trajectory policy network that provides a distribution over possible movements, and a human locomotion network that generates 3D human poses at each step.

A Two-Block RNN-Based Trajectory Prediction From Incomplete Trajectory

This paper introduces a two-block RNN model that approximates the inference steps of the Bayesian filtering framework and seeks the optimal estimate of the hidden state when misdetection occurs, using two RNNs depending on the detection result.
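The Bayesian-filtering view that this model approximates can be illustrated with a classical constant-velocity Kalman filter that simply skips the correction step when a detection is missing. This is a hand-written sketch, not the paper's RNN model; the motion model, noise matrices, and observation list are all illustrative assumptions:

```python
import numpy as np

def kalman_track(observations, dt=1.0, q=1e-2, r=1e-1):
    """Track a 2D position with a constant-velocity Kalman filter.

    observations: list of (x, y) tuples, or None for a missed detection.
    On a miss, only the prediction step (the Bayesian "time update") runs,
    analogous to falling back to a prediction-only RNN block.
    """
    # State: [x, y, vx, vy]; constant-velocity transition matrix.
    F = np.eye(4)
    F[0, 2] = F[1, 3] = dt
    H = np.zeros((2, 4)); H[0, 0] = H[1, 1] = 1.0   # observe position only
    Q = q * np.eye(4)                               # process noise (assumed)
    R = r * np.eye(2)                               # measurement noise (assumed)

    x, P = np.zeros(4), np.eye(4)
    track = []
    for z in observations:
        # Prediction (time update) always runs.
        x = F @ x
        P = F @ P @ F.T + Q
        if z is not None:
            # Correction (measurement update) only when a detection exists.
            y = np.asarray(z, float) - H @ x
            S = H @ P @ H.T + R
            K = P @ H.T @ np.linalg.inv(S)
            x = x + K @ y
            P = (np.eye(4) - K @ H) @ P
        track.append(x[:2].copy())
    return track

# Straight-line motion with a missed detection at the third frame.
obs = [(0, 0), (1, 1), None, (3, 3), (4, 4)]
estimates = kalman_track(obs)
```

Because the state carries velocity, the prediction-only step at the missed frame still advances the position estimate, which is exactly the gap the learned prediction block fills in the RNN formulation.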

Recognition and 3D Localization of Pedestrian Actions from Monocular Video

  • Jun Hayakawa, Behzad Dariush
  • 2020 IEEE 23rd International Conference on Intelligent Transportation Systems (ITSC)
This paper proposes an action recognition framework using a two-stream temporal relation network with inputs corresponding to the raw RGB image sequence of the tracked pedestrian as well as the pedestrian pose, and proposes a 3D localization method that outperforms existing state-of-the-art methods.

Detecting 32 Pedestrian Attributes for Autonomous Vehicles

This paper addresses the problem of jointly detecting pedestrians and recognizing 32 pedestrian attributes from a single image, and introduces a Multi-Task Learning (MTL) model relying on a composite field framework, which achieves both goals in an efficient way.

NEMO: Future Object Localization Using Noisy Ego Priors

  • Srikanth Malla, Chiho Choi
  • 2022 IEEE 25th International Conference on Intelligent Transportation Systems (ITSC)
This paper extensively evaluates the proposed NEMO (Noisy Ego MOtion priors for future object localization) framework using the publicly available benchmark dataset (HEV-I) supplemented with odometry data from an Inertial Measurement Unit (IMU).

DESIRE: Distant Future Prediction in Dynamic Scenes with Interacting Agents

The proposed Deep Stochastic IOC RNN Encoder-decoder framework, DESIRE, for the task of future predictions of multiple interacting agents in dynamic scenes significantly improves the prediction accuracy compared to other baseline methods.

DROGON: A Causal Reasoning Framework for Future Trajectory Forecast

This work proposes DROGON (Deep RObust Goal-Oriented trajectory prediction Network) for accurate vehicle trajectory forecast by considering behavioral intention of vehicles in traffic scenes, and builds a conditional prediction model to forecast goal-oriented trajectories.

Egocentric Vision-based Future Vehicle Localization for Intelligent Driving Assistance Systems

A novel approach is introduced to simultaneously predict both the location and scale of target vehicles in the first-person (egocentric) view of an ego-vehicle, using a multi-stream recurrent neural network encoder-decoder model that separately captures object location, scale, and pixel-level observations for future vehicle localization.

Long-Term On-board Prediction of People in Traffic Scenes Under Uncertainty

It is argued that prediction at least one second into the future is necessary, and a new model is proposed that jointly predicts ego motion and people's trajectories over such large time horizons; both sequence modeling of trajectories and the novel method of long-term odometry prediction are shown to be essential for best performance.

Conditional Generative Neural System for Probabilistic Trajectory Prediction

This paper proposes a conditional generative neural system (CGNS) for probabilistic trajectory prediction to approximate the data distribution, with which realistic, feasible and diverse future trajectory hypotheses can be sampled.

Social LSTM: Human Trajectory Prediction in Crowded Spaces

This work proposes an LSTM model that can learn general human movement and predict future trajectories, outperforming state-of-the-art methods on some of these datasets.
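The recurrence at the core of such models can be sketched with a minimal, untrained LSTM cell that encodes an observed 2D track and then rolls out future positions autoregressively. This is a pure-NumPy illustration: the weight shapes, random initialization, and linear output head are assumptions, and a real Social LSTM additionally pools hidden states across neighboring pedestrians:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class LSTMCell:
    """Minimal LSTM cell; input is a 2D position, hidden size is hidden_size."""
    def __init__(self, input_size, hidden_size, seed=0):
        rng = np.random.default_rng(seed)
        # One stacked weight matrix for the four gates (i, f, g, o).
        self.W = rng.normal(0, 0.1, (4 * hidden_size, input_size + hidden_size))
        self.b = np.zeros(4 * hidden_size)
        self.h_size = hidden_size

    def step(self, x, h, c):
        z = self.W @ np.concatenate([x, h]) + self.b
        i, f, g, o = np.split(z, 4)
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)   # cell state update
        h = sigmoid(o) * np.tanh(c)                    # hidden state output
        return h, c

def predict_trajectory(cell, W_out, observed, n_future):
    """Encode the observed positions, then decode n_future positions by
    feeding each prediction back in as the next input (autoregression)."""
    h = c = np.zeros(cell.h_size)
    for x in observed:                  # encoder: consume the past track
        h, c = cell.step(np.asarray(x, float), h, c)
    preds, x = [], np.asarray(observed[-1], float)
    for _ in range(n_future):           # decoder: autoregressive rollout
        h, c = cell.step(x, h, c)
        x = W_out @ h                   # linear head: hidden state -> (x, y)
        preds.append(x.copy())
    return preds

cell = LSTMCell(input_size=2, hidden_size=16)
W_out = np.random.default_rng(1).normal(0, 0.1, (2, 16))
past = [(0.0, 0.0), (0.5, 0.1), (1.0, 0.2)]   # observed track
future = predict_trajectory(cell, W_out, past, n_future=4)
```

With random weights the outputs are meaningless numbers; training would fit `W`, `b`, and `W_out` to minimize displacement error over ground-truth future tracks.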

Looking to Relations for Future Trajectory Forecast

  • Chiho Choi, B. Dariush
  • 2019 IEEE/CVF International Conference on Computer Vision (ICCV)
A relation-aware framework for future trajectory forecast that constructs pair-wise relations from spatio-temporal interactions and identifies more descriptive relations that highly influence future motion of the target road user by considering its past trajectory is proposed.

Interaction-aware Multi-agent Tracking and Probabilistic Behavior Prediction via Adversarial Learning

This work takes advantage of the Generative Adversarial Network due to its capability of distribution learning and proposes a generic multi-agent probabilistic prediction and tracking framework which takes the interactions among multiple entities into account, in which all the entities are treated as a whole.

Predicting Action Tubes

A Tube Prediction network (TPnet) is proposed which jointly predicts past, present, and future bounding boxes along with their action classification scores; TPnet is shown to improve state-of-the-art detection performance on the standard J-HMDB-21 action detection benchmark.