Multiple Object Forecasting: Predicting Future Object Locations in Diverse Environments

@inproceedings{Styles2020MultipleOF,
  title={Multiple Object Forecasting: Predicting Future Object Locations in Diverse Environments},
  author={Olly Styles and Tanaya Guha and Victor Sanchez},
  booktitle={2020 IEEE Winter Conference on Applications of Computer Vision (WACV)},
  year={2020},
  pages={679--688}
}
This paper introduces the problem of multiple object forecasting (MOF), in which the goal is to predict future bounding boxes of tracked objects. In contrast to existing works on object trajectory forecasting, which primarily consider the problem from a bird's-eye perspective, we formulate the problem from an object-level perspective and call for the prediction of full object bounding boxes, rather than trajectories alone. Towards solving this task, we introduce the Citywalks dataset, which…
Joint Learning Architecture for Multiple Object Tracking and Trajectory Forecasting
TLDR
Evaluation results show that JLA performs better for short-term motion prediction and reduces ID switches by 33%, 31%, and 47% on the MOT16, MOT17, and MOT20 datasets, respectively, compared with FairMOT.
Multi-Camera Trajectory Forecasting with Trajectory Tensors
TLDR
An MCTF framework is developed that simultaneously uses all estimated relative object locations from several viewpoints and predicts the object’s future location in all possible viewpoints. The concept of trajectory tensors is also proposed: a new technique to encode trajectories across multiple camera views and the associated uncertainties.
Multi-Camera Trajectory Forecasting: Pedestrian Trajectory Prediction in a Network of Cameras
TLDR
This work is the first to consider the challenging scenario of forecasting across multiple non-overlapping camera views, and considers the task of predicting the next camera in which a pedestrian will reappear after leaving the view of another camera.
Multimodal Transformer Networks for Pedestrian Trajectory Prediction
  • Ziyi Yin, Ruijin Liu, Zhiliang Xiong, Zejian Yuan
  • Computer Science
  • IJCAI
  • 2021
TLDR
An efficient multimodal transformer network is proposed that aggregates the trajectory and ego-vehicle speed variations at a coarse granularity and interacts with the optical flow at a fine-grained level to fill the vacancy of highly dynamic motion.
Multimodal Future Localization and Emergence Prediction for Objects in Egocentric View With a Reachability Prior
TLDR
Experiments show that the reachability prior combined with multi-hypotheses learning improves multimodal prediction of the future location of tracked objects and, for the first time, the emergence of new objects.
Joint Analysis and Prediction of Human Actions and Paths in Video
TLDR
This thesis conducts human action analysis and jointly optimizes models for action detection, prediction, and trajectory prediction, aiming to improve the performance and generalization ability of future trajectory and action prediction models.
A Two-Block RNN-Based Trajectory Prediction From Incomplete Trajectory
TLDR
This paper introduces a two-block RNN model that approximates the inference steps of the Bayesian filtering framework and seeks the optimal estimation of the hidden state when a missed detection occurs, using two RNNs depending on the detection result.
Simple means Faster: Real-Time Human Motion Forecasting in Monocular First Person Videos on CPU
  • J. Ansari, B. Bhowmick
  • Computer Science
  • 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
  • 2020
TLDR
This work is the first to accurately forecast trajectories at a very high prediction rate of 78 trajectories per second on CPU, and demonstrates that having an auto-encoder in the encoding phase of the past information and a regularizing layer at the end boosts the accuracy of predictions with negligible overhead.
Pedestrian Intention Prediction: A Multi-task Perspective
TLDR
This work tries to solve the problem of forecasting pedestrians' intentions sufficiently in advance by jointly predicting the intention and visual states of pedestrians, using a recurrent neural network in a multi-task learning approach.
SimAug: Learning Robust Representations from Simulation for Trajectory Prediction
TLDR
A novel approach to learn robust representations by augmenting the simulation training data so that the representation can better generalize to unseen real-world test data.

References

Tracking by Prediction: A Deep Generative Model for Mutli-person Localisation and Tracking
TLDR
This work introduces a lightweight sequential Generative Adversarial Network architecture for person localisation, which overcomes issues related to occlusions and noisy detections typically found in a multi-person environment, and proposes a novel data association scheme based on predicted trajectories.
EuroCity Persons: A Novel Benchmark for Person Detection in Traffic Scenes
TLDR
The EuroCity Persons dataset is introduced, which provides a large number of highly diverse, accurate and detailed annotations of pedestrians, cyclists and other riders in urban traffic scenes, which is nearly one order of magnitude larger than datasets used previously for person detection in traffic scenes.
Long-Term On-board Prediction of People in Traffic Scenes Under Uncertainty
TLDR
It is argued that it is necessary to predict at least 1 second, and a new model is proposed that jointly predicts ego motion and people trajectories over such large time horizons. Both sequence modeling of trajectories and the novel method of long-term odometry prediction are shown to be essential for best performance.
Future Person Localization in First-Person Videos
TLDR
A new task that predicts future locations of people observed in first-person videos by incorporating a prediction framework with a multi-stream convolution-deconvolution architecture that is effective on a new dataset as well as on a public social interaction dataset.
You'll never walk alone: Modeling social behavior for multi-target tracking
TLDR
A model of dynamic social behavior, inspired by models developed for crowd simulation, is introduced, trained with videos recorded from a bird's-eye view at busy locations, and applied as a motion model for multi-people tracking from a vehicle-mounted camera.
The EuroCity Persons Dataset: A Novel Benchmark for Object Detection
TLDR
The EuroCity Persons dataset is introduced, which provides a large number of highly diverse, accurate and detailed annotations of pedestrians, cyclists and other riders in urban traffic scenes, which is nearly one order of magnitude larger than person datasets used previously for benchmarking.
Forecasting Pedestrian Trajectory with Machine-Annotated Training Data
TLDR
This work addresses the lack of training data by introducing a scalable machine annotation scheme that enables the model to be trained using a large dataset without human annotation, and proposes Dynamic Trajectory Predictor (DTP), a model for forecasting pedestrian trajectory up to one second into the future.
Social LSTM: Human Trajectory Prediction in Crowded Spaces
TLDR
This work proposes an LSTM model which can learn general human movement and predict their future trajectories and outperforms state-of-the-art methods on some of these datasets.
Egocentric Vision-based Future Vehicle Localization for Intelligent Driving Assistance Systems
TLDR
A novel approach is introduced to simultaneously predict both the location and scale of target vehicles in the first-person (egocentric) view of an ego-vehicle, using a multi-stream recurrent neural network encoder-decoder model that separately captures both object location and scale and pixel-level observations for future vehicle localization.
Real-Time Multiple People Tracking with Deeply Learned Candidate Selection and Person Re-Identification
TLDR
This paper proposes to handle unreliable detection by collecting candidates from outputs of both detection and tracking, and adopts a deeply learned appearance representation, which is trained on large-scale person re-identification datasets, to improve the identification ability of the tracker.