Corpus ID: 53109814

IntentNet: Learning to Predict Intention from Raw Sensor Data

@article{Casas2018IntentNetLT,
  title={IntentNet: Learning to Predict Intention from Raw Sensor Data},
  author={Sergio Casas and Wenjie Luo and Raquel Urtasun},
  journal={ArXiv},
  year={2018},
  volume={abs/2101.07907}
}
In order to plan a safe maneuver, self-driving vehicles need to understand the intent of other traffic participants. We define intent as a combination of discrete high-level behaviors and continuous trajectories describing future motion. In this paper, we develop a one-stage detector and forecaster that exploits both 3D point clouds produced by a LiDAR sensor and dynamic maps of the environment. Our multi-task model achieves better accuracy than the respective separate modules…
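The abstract's one-stage formulation can be pictured, very loosely, as per-cell prediction heads over a bird's-eye-view feature map: each cell emits a detection score, a distribution over discrete high-level intentions, and continuous future waypoints. A minimal numpy sketch under that reading — all names, shapes, and the linear heads are illustrative assumptions, not the paper's actual architecture:

```python
import numpy as np

def intent_heads(bev_features, w_det, w_int, w_traj):
    """Apply shared linear (1x1-conv-style) heads to every BEV cell.

    bev_features: (H, W, C) feature map from a LiDAR+map backbone.
    Returns per-cell detection scores, intention probabilities,
    and T future (x, y) waypoint offsets.
    """
    H, W, C = bev_features.shape
    flat = bev_features.reshape(H * W, C)

    det_score = 1.0 / (1.0 + np.exp(-(flat @ w_det)))         # sigmoid, (H*W, 1)
    int_logits = flat @ w_int                                 # (H*W, K intentions)
    int_probs = np.exp(int_logits - int_logits.max(axis=1, keepdims=True))
    int_probs /= int_probs.sum(axis=1, keepdims=True)         # softmax over intentions
    waypoints = (flat @ w_traj).reshape(H, W, -1, 2)          # (H, W, T, 2)

    return det_score.reshape(H, W), int_probs.reshape(H, W, -1), waypoints

rng = np.random.default_rng(0)
H, W, C, K, T = 4, 4, 8, 3, 5   # tiny illustrative sizes
feats = rng.normal(size=(H, W, C))
det, intents, traj = intent_heads(feats,
                                  rng.normal(size=(C, 1)),
                                  rng.normal(size=(C, K)),
                                  rng.normal(size=(C, T * 2)))
```

The point of sharing one backbone across the three heads is the multi-task claim in the abstract: detection, intention, and trajectory outputs are trained jointly rather than as separate modules.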

Citations

Maneuver-based Trajectory Prediction for Self-driving Cars Using Spatio-temporal Convolutional Networks
TLDR: A new, data-driven approach for predicting the motion of vehicles in a road environment that jointly exploits correlations in the spatial and temporal domains using convolutional neural networks, achieving competitive prediction performance.
Goal-Directed Occupancy Prediction for Lane-Following Actors
TLDR: This work proposes a new method that leverages the mapped road topology to reason over possible goals and predict the future spatial occupancy of dynamic road actors, and shows that it accurately predicts future occupancy that remains consistent with the mapped lane geometry and naturally captures multi-modality based on the local scene context.
End-To-End Interpretable Neural Motion Planner
TLDR: A holistic model is designed that takes as input raw LiDAR data and an HD map and produces interpretable intermediate representations in the form of 3D detections and their future trajectories, as well as a cost volume defining the goodness of each position that the self-driving car can take within the planning horizon.
Rules of the Road: Predicting Driving Behavior With a Convolutional Model of Semantic Interactions
TLDR: A unified representation is presented which encodes such high-level semantic information in a spatial grid, allowing the use of deep convolutional models to fuse complex scene context; the authors empirically show that one can effectively learn fundamentals of driving behavior.
MotionCNN: A Strong Baseline for Motion Prediction in Autonomous Driving
TLDR: This work presents a simple yet very strong baseline for multimodal motion prediction based purely on convolutional neural networks that achieves competitive performance compared to state-of-the-art methods and ranks 3rd on the 2021 Waymo Open Dataset Motion Prediction Challenge.
MultiXNet: Multiclass Multistage Multimodal Motion Prediction
TLDR: The proposed MultiXNet approach for detection and motion prediction based directly on lidar sensor data builds on prior work by handling multiple classes of traffic actors, adding a jointly trained second-stage trajectory refinement step, and producing a multimodal probability distribution over future actor motion.
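A multimodal probability distribution over future motion, as in the summary above, is commonly represented as a small set of trajectory modes with softmax confidences, often trained with a winner-takes-all (min-over-modes) loss so that each mode specializes. A hedged numpy sketch — the function names and this particular loss are generic illustrations, not MultiXNet's exact formulation:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def min_over_modes_loss(modes, conf_logits, gt):
    """Winner-takes-all loss over K trajectory modes.

    modes:       (K, T, 2) predicted waypoints per mode.
    conf_logits: (K,) unnormalized mode confidences.
    gt:          (T, 2) ground-truth future waypoints.
    Returns (loss, index of the best-matching mode).
    """
    confs = softmax(conf_logits)
    # Squared regression error of each mode against the ground truth.
    errs = np.sum((modes - gt[None]) ** 2, axis=(1, 2))       # (K,)
    best = int(np.argmin(errs))
    # Regress only the closest mode; classify which mode is closest.
    loss = errs[best] - np.log(confs[best])
    return loss, best

K, T = 3, 5
gt = np.linspace(0.0, 4.0, T)[:, None] * np.array([1.0, 0.0])  # straight-ahead future
modes = np.stack([gt, gt + 2.0, gt - 2.0])                     # mode 0 matches exactly
loss, best = min_over_modes_loss(modes, np.zeros(K), gt)
```

Because only the closest mode receives the regression gradient, the remaining modes are free to cover alternative futures, which is what makes the output distribution multimodal.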
ParkPredict: Motion and Intent Prediction of Vehicles in Parking Lots
TLDR: The results show that intent can be estimated well, that knowledge of the human driver's intended parking spot has a major impact on predicting the parking trajectory, and that the semantic representation of the environment improves long-term predictions.
Autonomous Driving with Interpretable Goal Recognition and Monte Carlo Tree Search
TLDR: An integrated planning and prediction system is proposed which leverages the computational benefit of using a finite space of maneuvers, and extends the approach to planning and prediction of maneuver sequences via rational inverse planning to recognise the goals of other vehicles.
Exploring Map-based Features for Efficient Attention-based Vehicle Motion Prediction
TLDR: This work explores how to achieve competitive performance on the Argoverse 1.0 benchmark using efficient attention-based models, which take as input past trajectories and map-based features derived from minimal map information to ensure efficient and reliable motion prediction.
PnPNet: End-to-End Perception and Prediction With Tracking in the Loop
TLDR: This work proposes PnPNet, an end-to-end model that takes as input sequential sensor data and outputs, at each time step, object tracks and their future trajectories, showing significant improvements over the state of the art with better occlusion recovery and more accurate future prediction.
…

References

Showing 1–10 of 38 references
Motion Prediction of Traffic Actors for Autonomous Driving using Deep Convolutional Networks
TLDR: A deep learning-based approach is introduced that takes into account the current state of traffic actors and produces rasterized representations of each actor's vicinity, and has been deployed to a fleet of autonomous vehicles.
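Rasterizing an actor's vicinity, as mentioned in the summary above, typically means painting nearby actors into a fixed-size bird's-eye-view grid that a CNN can consume. A minimal sketch assuming point actors and a single occupancy channel (real rasterizers draw oriented polygons and map layers; all names here are illustrative):

```python
import numpy as np

def rasterize_actors(positions, grid_size=8, cell_m=1.0, center=(0.0, 0.0)):
    """Rasterize actor (x, y) positions into an egocentric BEV occupancy grid.

    positions: iterable of (x, y) coordinates in meters, relative to `center`.
    Returns a (grid_size, grid_size) float grid with 1.0 where an actor falls;
    actors outside the grid extent are silently dropped.
    """
    grid = np.zeros((grid_size, grid_size), dtype=np.float32)
    half = grid_size * cell_m / 2.0          # grid is centered on `center`
    for x, y in positions:
        col = int((x - center[0] + half) / cell_m)
        row = int((y - center[1] + half) / cell_m)
        if 0 <= row < grid_size and 0 <= col < grid_size:
            grid[row, col] = 1.0
    return grid

actors = [(0.0, 0.0), (3.2, -2.5), (100.0, 0.0)]   # the last one is off-grid
bev = rasterize_actors(actors)
```

Stacking several such channels (one per map layer, actor class, or past timestep) yields the image-like input these rasterization-based prediction models feed to a CNN.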
Probabilistic Prediction of Vehicle Semantic Intention and Motion
TLDR: A Semantic-based Intention and Motion Prediction (SIMP) method which can be adapted to any driving scenario by using semantically defined vehicle behaviors; it utilizes a probabilistic framework based on a deep neural network to estimate the intentions, final locations, and corresponding time information for surrounding vehicles.
Generalizable intention prediction of human drivers at intersections
TLDR: A new class of recurrent neural networks, Long Short-Term Memory networks (LSTMs), is shown to outperform the state of the art in predicting whether a driver will turn left, turn right, or continue straight, with consistent accuracy from up to 150 m before reaching the intersection.
Understanding High-Level Semantics by Modeling Traffic Patterns
TLDR: A generative model of 3D urban scenes is proposed which is able to reason not only about the geometry and objects present in the scene, but also about high-level semantics in the form of traffic patterns.
Dynamic Occupancy Grid Prediction for Urban Autonomous Driving: A Deep Learning Approach with Fully Automatic Labeling
TLDR: This contribution tackles the prediction of complex downtown scenarios with multiple road users (e.g., pedestrians, bikes, and motor vehicles) interacting with each other, by combining a Bayesian filtering technique for environment representation with machine learning as a long-term predictor, using a dynamic occupancy grid map as input to a deep convolutional neural network.
Car that Knows Before You Do: Anticipating Maneuvers via Learning Temporal Driving Models
TLDR: This work proposes an Autoregressive Input-Output HMM to model contextual information along with driving maneuvers, and shows that it can anticipate maneuvers 3.5 seconds before they occur with over 80% F1-score in real time.
PIXOR: Real-time 3D Object Detection from Point Clouds
TLDR: PIXOR is proposed, a proposal-free, single-stage detector that outputs oriented 3D object estimates decoded from pixel-wise neural network predictions; it surpasses other state-of-the-art methods notably in terms of Average Precision (AP) while still running at 10 FPS.
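Decoding oriented boxes from pixel-wise predictions, as the PIXOR summary describes, amounts to thresholding a per-pixel score map and converting each surviving pixel's regression targets into a box. A hedged sketch of that decode step — the exact target parameterization (offsets, log-sizes, sin/cos heading) is a common convention assumed here, not necessarily PIXOR's precise one:

```python
import numpy as np

def decode_boxes(score_map, reg_map, cell_m=0.5, thresh=0.5):
    """Decode per-pixel predictions into oriented BEV boxes.

    score_map: (H, W) objectness confidences in [0, 1].
    reg_map:   (H, W, 6) per-pixel (dx, dy, log_w, log_l, cos, sin) targets.
    Returns a list of (cx, cy, w, l, heading) for pixels above `thresh`.
    """
    boxes = []
    for r, c in zip(*np.where(score_map > thresh)):
        dx, dy, log_w, log_l, cos_t, sin_t = reg_map[r, c]
        cx = c * cell_m + dx                  # pixel location plus regressed offset
        cy = r * cell_m + dy
        boxes.append((cx, cy, np.exp(log_w), np.exp(log_l),
                      float(np.arctan2(sin_t, cos_t))))
    return boxes

H, W = 4, 4
scores = np.zeros((H, W)); scores[2, 1] = 0.9
regs = np.zeros((H, W, 6)); regs[2, 1] = [0.1, -0.2, 0.0, np.log(4.0), 1.0, 0.0]
dets = decode_boxes(scores, regs)
```

In practice a non-maximum-suppression pass follows, since neighboring pixels of the same object all clear the threshold.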
DESIRE: Distant Future Prediction in Dynamic Scenes with Interacting Agents
TLDR: The proposed Deep Stochastic IOC RNN Encoder-Decoder framework, DESIRE, for the task of future prediction of multiple interacting agents in dynamic scenes significantly improves the prediction accuracy compared to other baseline methods.
Fast and Furious: Real Time End-to-End 3D Detection, Tracking and Motion Forecasting with a Single Convolutional Net
TLDR: A novel deep neural network is proposed that is able to jointly reason about 3D detection, tracking, and motion forecasting given data captured by a 3D sensor, and is very efficient in terms of both memory and computation.
Vehicle Detection from 3D Lidar Using Fully Convolutional Network
TLDR: This paper proposes to present the data in a 2D point map and use a single 2D end-to-end fully convolutional network to predict the objectness confidence and the bounding boxes simultaneously, showing state-of-the-art performance of the proposed method.
…