Narrowing the coordinate-frame gap in behavior prediction models: Distillation for efficient and accurate scene-centric motion forecasting

@article{Su2022NarrowingTC,
  title={Narrowing the coordinate-frame gap in behavior prediction models: Distillation for efficient and accurate scene-centric motion forecasting},
  author={DiJia Su and Bertrand Douillard and Rami Al-Rfou and Cheol Soo Park and Benjamin Sapp},
  journal={2022 International Conference on Robotics and Automation (ICRA)},
  year={2022},
  pages={653--659}
}
Behavior prediction models have proliferated in recent years, especially in the popular real-world robotics application of autonomous driving, where representing the distribution over possible futures of moving agents is essential for safe and comfortable motion planning. In these models, the choice of coordinate frames to represent inputs and outputs has crucial trade-offs which broadly fall into one of two categories. Agent-centric models transform inputs and perform inference in agent…

References

Scene Transformer: A unified multi-task model for behavior prediction and planning

This work demonstrates that formulating the problem of behavior prediction in a unified architecture with a masking strategy may allow us to have a single model that can perform multiple motion prediction and planning related tasks effectively.

LaneRCNN: Distributed Representations for Graph-Centric Motion Forecasting

LaneRCNN is proposed, a graph-centric motion forecasting model that captures the actor-to-actor and the actor-to-map relations in a distributed and structured manner and parameterizes the output trajectories based on lane graphs, a more amenable prediction parameterization.

Trajectron++: Dynamically-Feasible Trajectory Forecasting with Heterogeneous Data

Trajectron++ is a modular, graph-structured recurrent model that forecasts the trajectories of a general number of diverse agents while incorporating agent dynamics and heterogeneous data and outperforming a wide array of state-of-the-art deterministic and generative methods.

Trajectron++: Multi-Agent Generative Trajectory Forecasting With Heterogeneous Data for Control

Trajectron++ is a modular, graph-structured recurrent model that forecasts the trajectories of a general number of agents with distinct semantic classes while incorporating heterogeneous data (e.g. semantic maps and camera images) and is designed to be tightly integrated with robotic planning and control frameworks.

PRECOG: PREdiction Conditioned on Goals in Visual Multi-Agent Settings

A probabilistic forecasting model of future interactions between a variable number of agents that performs both standard forecasting and the novel task of conditional forecasting, which reasons about how all agents will likely respond to the goal of a controlled agent.

DESIRE: Distant Future Prediction in Dynamic Scenes with Interacting Agents

The proposed Deep Stochastic IOC RNN Encoder-decoder framework, DESIRE, for the task of future predictions of multiple interacting agents in dynamic scenes significantly improves the prediction accuracy compared to other baseline methods.

Multiple Futures Prediction

A probabilistic framework that efficiently learns latent variables to jointly model the multi-step future motions of agents in a scene and can be used for planning via computing a conditional probability density over the trajectories of other agents given a hypothetical rollout of the ego agent.

Large Scale Interactive Motion Forecasting for Autonomous Driving: The Waymo Open Motion Dataset

This work introduces what is, to the authors' knowledge, the most diverse interactive motion dataset, and provides specific labels for interacting objects suitable for developing joint prediction models, in a bid to provide strong baseline models for individual-agent prediction and joint prediction.

What-If Motion Prediction for Autonomous Driving

This work proposes a recurrent graph-based attentional approach with interpretable geometric and social relationships that supports the injection of counterfactual geometric goals and social contexts that could be used in the planning loop to reason about unobserved causes or unlikely futures that are directly relevant to the AV's intended route.

Rules of the Road: Predicting Driving Behavior With a Convolutional Model of Semantic Interactions

A unified representation is presented which encodes such high-level semantic information in a spatial grid, allowing the use of deep convolutional models to fuse complex scene context, and it is shown empirically that one can effectively learn fundamentals of driving behavior.