Corpus ID: 52001838

Adaptive Skip Intervals: Temporal Abstraction for Recurrent Dynamical Models

@article{Neitz2018AdaptiveSI,
  title={Adaptive Skip Intervals: Temporal Abstraction for Recurrent Dynamical Models},
  author={Alexander Neitz and Giambattista Parascandolo and Stefan Bauer and Bernhard Sch{\"o}lkopf},
  journal={ArXiv},
  year={2018},
  volume={abs/1808.04768}
}
We introduce a method which enables a recurrent dynamics model to be temporally abstract. Our approach, which we call Adaptive Skip Intervals (ASI), is based on the observation that in many sequential prediction tasks, the exact time at which events occur is irrelevant to the underlying objective. Moreover, in many situations, there exist prediction intervals which result in particularly easy-to-predict transitions. We show that there are prediction tasks for which we gain both computational… 
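As a rough, hypothetical sketch of the temporal-matching idea described in this abstract (not the authors' implementation), the snippet below trains a one-step predictor whose target at each rollout step is whichever of the next few ground-truth frames it currently predicts best, so the effective skip interval adapts to the data. The function and variable names, the squared-error matching criterion, and the fixed horizon are assumptions; details such as scheduled sampling and exploration over match indices are omitted.

```python
import torch

def asi_training_loss(model, frames, horizon=5, rollout_len=4):
    """Schematic ASI-style temporal matching (illustrative names, not the paper's code).

    `model` maps one frame to a predicted future frame; `frames` is a
    tensor of shape (T, C, H, W) holding a single trajectory.  At every
    rollout step the target is not the immediate next frame but the
    frame within the next `horizon` steps that the prediction matches best.
    """
    x = frames[0]
    t = 0                      # index of the frame currently represented by x
    loss = torch.zeros(())
    for _ in range(rollout_len):
        pred = model(x)
        # candidate ground-truth frames the prediction may be matched to
        candidates = frames[t + 1 : t + 1 + horizon]
        if len(candidates) == 0:
            break
        errors = ((candidates - pred.unsqueeze(0)) ** 2).flatten(1).mean(dim=1)
        k = int(errors.argmin())           # easiest-to-predict frame wins
        loss = loss + errors[k]
        t = t + 1 + k                      # skip ahead by the chosen interval
        x = pred                           # feed the prediction back in
    return loss / rollout_len
```

Because the model is only penalized against the best-matching future frame, it can spend its capacity on transitions that are easy to predict and effectively skip over uninformative intermediate frames.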
Variational Temporal Abstraction
TLDR
The Variational Temporal Abstraction (VTA) is proposed, a hierarchical recurrent state-space model that infers latent temporal structure and thus performs stochastic state transitions hierarchically; it is applied to implement jumpy imagination in an imagination-augmented agent, improving the efficiency of imagination.
Learning Transition Models with Time-delayed Causal Relations
This paper introduces an algorithm for discovering implicit and delayed causal relations between events observed by a robot at arbitrary times, with the objective of improving data-efficiency and
Episodic Memory for Subjective-Timescale Models
Planning in complex environments requires reasoning over multi-step timescales. However, in model-based learning, an agent’s model is more commonly defined over transitions between consecutive
Variational Predictive Routing with Nested Subjective Timescales
TLDR
Variational Predictive Routing is presented – a neural probabilistic inference system that organizes latent representations of video features in a temporal hierarchy, based on their rates of change, thus modeling continuous data as a hierarchical renewal process.
Temporal Difference Variational Auto-Encoder
TLDR
TD-VAE is proposed, a generative sequence model that learns representations containing explicit beliefs about states several steps into the future, and that can be rolled out directly without single-step transitions.
Time-Agnostic Prediction: Predicting Predictable Video Frames
TLDR
This work decouples visual prediction from a rigid notion of time, so that time-agnostic predictors (TAP) are not tied to specific times and may instead discover predictable "bottleneck" frames no matter when they occur.
Model-based Reinforcement Learning for Semi-Markov Decision Processes with Neural ODEs
We present two elegant solutions for modeling continuous-time dynamics, in a novel model-based reinforcement learning (RL) framework for semi-Markov decision processes (SMDPs), using neural ordinary differential equations (ODEs).
A teacher-student framework to distill future trajectories
TLDR
Instead of hand-designing how trajectories should be incorporated, a teacher network learns to extract relevant information from the trajectories and to distill it into target activations which guide a student model that can only observe the present.
A Dynamically Controlled Recurrent Neural Network for Modeling Dynamical Systems
TLDR
The proposed DCRNN includes learnable skip-connections across previous hidden states and introduces a regularization term in the loss function based on Lyapunov stability theory, which enables placing the eigenvalues of the transfer function induced by the DCRNN at desired values, thereby acting as an internal controller for the hidden-state trajectory.
Compositional Imitation Learning: Explaining and executing one task at a time
We introduce a framework for Compositional Imitation Learning and Execution (CompILE) of hierarchically-structured behavior. CompILE learns reusable, variable-length segments of behavior from
...

References

SHOWING 1-10 OF 34 REFERENCES
Recurrent Environment Simulators
TLDR
This work addresses computational inefficiency with a model that does not need to generate a high-dimensional image at each time-step, can be used to improve exploration, and is adaptable to many diverse environments.
Scheduled Sampling for Sequence Prediction with Recurrent Neural Networks
TLDR
This work proposes a curriculum learning strategy to gently change the training process from a fully guided scheme using the true previous token, towards a less guided scheme which mostly uses the generated token instead.
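As a hedged illustration of the curriculum described above (not the exact schedule from either paper), the sketch below mixes ground-truth and model-generated previous tokens using the inverse-sigmoid decay eps = k / (k + exp(step / k)); the function name, argument names, and the decay constant k are illustrative assumptions.

```python
import math
import random

def scheduled_sampling_inputs(true_prev_tokens, model_prev_tokens, step, k=10.0):
    """Illustrative scheduled-sampling input mixing (names are assumptions).

    With probability eps the ground-truth previous token is fed to the
    decoder, otherwise the model's own previous prediction is used; eps
    decays over training, moving from a fully guided scheme toward one
    that mostly uses generated tokens.
    """
    eps = k / (k + math.exp(step / k))    # inverse-sigmoid decay toward 0
    return [gt if random.random() < eps else pred
            for gt, pred in zip(true_prev_tokens, model_prev_tokens)]
```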
Time-Agnostic Prediction: Predicting Predictable Video Frames
TLDR
This work decouples visual prediction from a rigid notion of time, so that time-agnostic predictors (TAP) are not tied to specific times and may instead discover predictable "bottleneck" frames no matter when they occur.
Value Prediction Network
TLDR
This paper proposes a novel deep reinforcement learning architecture, called Value Prediction Network (VPN), which integrates model-free and model-based RL methods into a single neural network and outperforms Deep Q-Network on several Atari games even with short-lookahead planning.
Recent Advances in Hierarchical Reinforcement Learning
TLDR
This work reviews several approaches to temporal abstraction and hierarchical organization that machine learning researchers have recently developed and discusses extensions of these ideas to concurrent activities, multiagent coordination, and hierarchical memory for addressing partial observability.
Learning and Querying Fast Generative Models for Reinforcement Learning
TLDR
Agents that query these models for decision making are demonstrated to outperform strong model-free baselines on the game MSPACMAN, showing the potential of learned environment models for planning.
Temporal Difference Models: Model-Free Deep RL for Model-Based Control
Model-free reinforcement learning (RL) is a powerful, general tool for learning complex behaviors. However, its sample complexity is often impractically large for solving challenging real-world
Sparse Attentive Backtracking: Long-Range Credit Assignment in Recurrent Networks
TLDR
This work proposes an alternative algorithm, Sparse Attentive Backtracking, which learns an attention mechanism over past hidden states and selectively backpropagates through paths with high attention weights, and which may also relate to principles the brain uses to learn long-term dependencies.
The Predictron: End-To-End Learning and Planning
TLDR
The predictron consists of a fully abstract model, represented by a Markov reward process, that can be rolled forward multiple "imagined" planning steps, accumulating internal rewards and values over multiple planning depths.
...