Corpus ID: 49314531

Bayesian Prediction of Future Street Scenes through Importance Sampling based Optimization

Apratim Bhattacharyya, Mario Fritz, Bernt Schiele

For autonomous agents to successfully operate in the real world, anticipation of future events and states of their environment is a key competence. This problem can be formalized as a sequence prediction problem, where a number of observations are used to predict the sequence into the future. However, real-world scenarios demand a model of uncertainty of such predictions, as future states become increasingly uncertain and multi-modal -- in particular on long time horizons. This makes modelling… 

Multimodal Future Localization and Emergence Prediction for Objects in Egocentric View With a Reachability Prior

Experiments show that the reachability prior combined with multi-hypotheses learning improves multimodal prediction of the future location of tracked objects and, for the first time, the emergence of new objects.

Game Plan: What AI can do for Football, and What Football can do for AI

An overarching perspective is provided, highlighting how the combination of these fields forms a unique microcosm for AI research while offering mutual benefits for professional teams, spectators, and broadcasters in the years to come.



DESIRE: Distant Future Prediction in Dynamic Scenes with Interacting Agents

DESIRE, the proposed Deep Stochastic IOC RNN Encoder-decoder framework for predicting the futures of multiple interacting agents in dynamic scenes, significantly improves prediction accuracy compared to baseline methods.

Stochastic Variational Video Prediction

This paper develops a stochastic variational video prediction (SV2P) method that predicts a different possible future for each sample of its latent variables, and is the first to provide effective stochastic multi-frame prediction for real-world video.

Long-Term On-board Prediction of People in Traffic Scenes Under Uncertainty

It is argued that predicting at least 1 second ahead is necessary, and a new model is proposed that jointly predicts ego motion and people trajectories over such long time horizons. Both sequence modeling of trajectories and the novel method of long-term odometry prediction are shown to be essential for best performance.

Predicting Scene Parsing and Motion Dynamics in the Future

A novel model to simultaneously predict scene parsing and optical flow in unobserved future video frames is proposed and shows significantly better parsing and motion prediction results when compared to well-established baselines and individual prediction models on the large-scale Cityscapes dataset.

Predicting Deeper into the Future of Semantic Segmentation

An autoregressive convolutional neural network that learns to iteratively generate multiple frames is developed and results show that directly predicting future segmentations is substantially better than predicting and then segmenting future RGB frames.

What Uncertainties Do We Need in Bayesian Deep Learning for Computer Vision?

A Bayesian deep learning framework combining input-dependent aleatoric uncertainty with epistemic uncertainty is presented; it makes the loss more robust to noisy data and gives new state-of-the-art results on segmentation and depth regression benchmarks.

Learning Structured Output Representation using Deep Conditional Generative Models

A deep conditional generative model for structured output prediction using Gaussian latent variables is developed; it is trained efficiently in the framework of stochastic gradient variational Bayes and allows for fast prediction using stochastic feed-forward inference.

Visual Dynamics: Probabilistic Future Frame Synthesis via Cross Convolutional Networks

A novel approach that models future frames in a probabilistic manner is proposed, namely a Cross Convolutional Network to aid in synthesizing future frames; this network structure encodes image and motion information as feature maps and convolutional kernels, respectively.

Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning

A new theoretical framework is developed casting dropout training in deep neural networks (NNs) as approximate Bayesian inference in deep Gaussian processes, which mitigates the problem of representing uncertainty in deep learning without sacrificing either computational complexity or test accuracy.

Stochastic Video Generation with a Learned Prior

An unsupervised video generation model is introduced that learns a prior model of uncertainty in a given environment and generates video frames by drawing samples from this prior and combining them with a deterministic estimate of the future frame.