Corpus ID: 203642129

Learning Calibratable Policies using Programmatic Style-Consistency

@inproceedings{Zhan2019LearningCP,
  title={Learning Calibratable Policies using Programmatic Style-Consistency},
  author={Eric Zhan and Albert Tseng and Yisong Yue and Adith Swaminathan and Matthew J. Hausknecht},
  booktitle={International Conference on Machine Learning},
  year={2019}
}
We study the problem of controllable generation of long-term sequential behaviors, where the goal is to calibrate to multiple behavior styles simultaneously. In contrast to the well-studied areas of controllable generation of images, text, and speech, there are two questions that pose significant challenges when generating long-term behaviors: how should we specify the factors of variation to control, and how can we ensure that the generated behavior faithfully demonstrates combinatorially many… 
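
The style-consistency idea behind the paper can be illustrated with a toy sketch: a short labeling program assigns each trajectory a style label, and generated behavior is scored by how often it matches the label it was conditioned on. The speed-based program and function names below are illustrative assumptions, not the paper's actual programs:

```python
import math

def speed_label(trajectory, threshold=1.0):
    """Hypothetical labeling program: 1 ('fast') if the mean step
    displacement exceeds `threshold`, else 0 ('slow')."""
    dists = [math.dist(a, b) for a, b in zip(trajectory, trajectory[1:])]
    return int(sum(dists) / len(dists) > threshold)

def style_consistency(generated, requested, labeler):
    """Fraction of generated trajectories whose programmatic label
    matches the style label they were conditioned on."""
    return sum(labeler(t) == y for t, y in zip(generated, requested)) / len(generated)

# toy usage: one fast and one slow trajectory, both correctly calibrated
fast = [(2.0 * i, 2.0 * i) for i in range(10)]   # step length ~2.83
slow = [(0.1 * i, 0.1 * i) for i in range(10)]   # step length ~0.14
print(style_consistency([fast, slow], [1, 0], speed_label))  # -> 1.0
```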

Task Programming: Learning Data Efficient Behavior Representations

TREBA is presented: a method to learn annotation-sample-efficient trajectory embeddings for behavior analysis, based on multi-task self-supervised learning and task programming, which uses programs to explicitly encode structured knowledge from domain experts.

baller2vec: A Multi-Entity Transformer For Multi-Agent Spatiotemporal Modeling

Multi-agent spatiotemporal modeling is a challenging task from both an algorithmic design and computational complexity perspective. Recent work has explored the efficacy of traditional deep…

The Multi-Agent Behavior Dataset: Mouse Dyadic Social Interactions

This work presents a multi-agent dataset from behavioral neuroscience, the Caltech Mouse Social Interactions (CalMS21) Dataset, which consists of trajectory data of social interactions, recorded from videos of freely behaving mice in a standard resident-intruder assay.

Unsupervised Learning of Neurosymbolic Encoders

This work integrates modern program synthesis techniques with the variational autoencoding (VAE) framework, in order to learn a neurosymbolic encoder in conjunction with a standard decoder, which leads to more interpretable and factorized latent representations compared to fully neural encoders.
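
The encoder factorization can be sketched as follows: one latent dimension is computed by an interpretable program while the rest come from a neural encoder. The displacement program below is a hand-written stand-in; in the paper's framework such programs are found by synthesis, not written by hand:

```python
def program_encoder(trajectory):
    """Hypothetical synthesized program: one interpretable latent
    dimension, here the net displacement of the trajectory."""
    (x0, y0), (xT, yT) = trajectory[0], trajectory[-1]
    return ((xT - x0) ** 2 + (yT - y0) ** 2) ** 0.5

def encode(trajectory, neural_encoder):
    """Neurosymbolic encoder: concatenate the program's interpretable
    dimension with the remaining (neural) latent dimensions."""
    return [program_encoder(trajectory)] + neural_encoder(trajectory)

# stand-in for a trained neural encoder producing two latent dimensions
fake_neural = lambda traj: [0.0, 0.0]
print(encode([(0, 0), (3, 4)], fake_neural))  # -> [5.0, 0.0, 0.0]
```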

Neurosymbolic Programming

This work surveys recent work on neurosymbolic programming, an emerging area that bridges deep learning and program synthesis, in which programs serve as a form of regularization and lead to more generalizable and data-efficient learning.

UniMASK: Unified Inference in Sequential Decision Problems

The UniMASK framework is introduced, which provides a unified way to specify models which can be trained on many different sequential decision making tasks, and it is shown that a single UniMASK model is often capable of carrying out many tasks with performance similar to or better than single-task models.
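
The unified-masking idea can be sketched in a few lines: each task corresponds to a pattern of hidden cells over a (state, action) sequence, and one model is trained to fill in whatever is hidden. The task names and mask layouts below are illustrative assumptions, not the paper's exact scheme:

```python
def make_mask(T, task):
    """Return per-timestep (state_visible, action_visible) flags for a
    length-T trajectory; which cells are hidden defines the task."""
    if task == "behavior_cloning":      # see all states, predict actions
        return [(True, False) for _ in range(T)]
    if task == "forward_dynamics":      # see the past, predict the final state
        mask = [(True, True) for _ in range(T)]
        mask[-1] = (False, True)
        return mask
    if task == "goal_conditioned":      # only first and last states visible
        mask = [(False, False) for _ in range(T)]
        mask[0] = (True, False)
        mask[-1] = (True, False)
        return mask
    raise ValueError(f"unknown task: {task}")

print(make_mask(3, "behavior_cloning"))  # -> [(True, False), (True, False), (True, False)]
```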

The MABe22 Benchmarks for Representation Learning of Multi-Agent Behavior

This work introduces a large-scale, multi-agent trajectory dataset from real-world behavioral neuroscience experiments that covers a range of behavior analysis tasks, and shows that the corresponding behavioral representations work across multiple organisms and capture differences on common behavior analysis tasks.

Towards Flexible Inference in Sequential Decision Problems via Bidirectional Transformers

The FlexiBiT framework is introduced, which provides a unified way to specify models which can be trained on many different sequential decision making tasks and shows that a single FlexiBiT model is simultaneously capable of carrying out many tasks with performance similar to or better than specialized models.

Synthesizing Video Trajectory Queries

We propose a novel framework called Quivr for example-based synthesis of queries to identify events of interest in video data; these queries are essentially regular expressions that operate over…

References

Showing 1-10 of 60 references

Generating Multi-Agent Trajectories using Programmatic Weak Supervision

This work presents a hierarchical framework that can effectively learn sequential generative models for capturing coordinated multi-agent trajectory behavior, such as offensive basketball gameplay, and is inspired by recent work on leveraging programmatically produced weak labels, which it extends to the spatiotemporal regime.
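
As a toy illustration of programmatically produced weak labels, a short program can assign each trajectory a macro-intent label with no manual annotation. The grid discretization, court bounds, and function name below are hypothetical, not taken from the paper:

```python
def macro_intent(trajectory, grid=4, court=(0.0, 1.0)):
    """Hypothetical weak-labeling program: label a trajectory by the
    grid cell containing its final position, treated as the agent's
    long-term goal (macro-intent)."""
    lo, hi = court
    x, y = trajectory[-1]
    col = min(int((x - lo) / (hi - lo) * grid), grid - 1)
    row = min(int((y - lo) / (hi - lo) * grid), grid - 1)
    return row * grid + col

# a trajectory ending in the top-right cell of a 4x4 grid
print(macro_intent([(0.1, 0.1), (0.9, 0.9)]))  # -> 15
```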

Customizing Scripted Bots: Sample Efficient Imitation Learning for Human-like Behavior in Minecraft

It is demonstrated how to combine imitation learning with scripted agents in order to efficiently train hierarchical policies that improve upon the expressiveness of the original scripted agent, allowing more diverse and human-like behavior to emerge.

Dynamics-Aware Unsupervised Discovery of Skills

This work proposes an unsupervised learning algorithm, Dynamics-Aware Discovery of Skills (DADS), which simultaneously discovers predictable behaviors and learns their dynamics, and demonstrates that zero-shot planning in the learned latent space significantly outperforms standard MBRL and model-free goal-conditioned RL, and substantially improves over prior hierarchical RL methods for unsupervised skill discovery.

Generating Long-term Trajectories Using Deep Hierarchical Networks

This work proposes a hierarchical policy class that automatically reasons about both long-term and short-term goals, which is instantiated as a hierarchical neural network and generates significantly more realistic trajectories compared to non-hierarchical baselines, as judged by professional sports analysts.

CARL: Controllable Agent with Reinforcement Learning for Quadruped Locomotion

CARL is presented, a quadruped agent that can be controlled with high-level directives and react naturally to dynamic environments and is evaluated by measuring the agent's ability to follow user control and providing a visual analysis of the generated motion to show its effectiveness.

Hierarchical Imitation and Reinforcement Learning

This work proposes an algorithmic framework, called hierarchical guidance, that leverages the hierarchical structure of the underlying problem to integrate different modes of expert interaction and can incorporate different combinations of imitation learning and reinforcement learning at different levels, leading to dramatic reductions in both expert effort and cost of exploration.

InfoGAIL: Interpretable Imitation Learning from Visual Demonstrations

A new algorithm is proposed that can infer the latent structure of expert demonstrations in an unsupervised way, built on top of Generative Adversarial Imitation Learning, and can not only imitate complex behaviors, but also learn interpretable and meaningful representations of complex behavioral data, including visual demonstrations.

Robust Imitation of Diverse Behaviors

A new version of GAIL is developed that is much more robust than the purely-supervised controller, especially with few demonstrations, and avoids mode collapse, capturing many diverse behaviors when GAIL on its own does not.

Character controllers using motion VAEs

This work uses deep reinforcement learning to learn controllers that achieve goal-directed movements in data-driven generative models of human movement using autoregressive conditional variational autoencoders, or Motion VAEs.

Toward Controlled Generation of Text

A new neural generative model is proposed which combines variational auto-encoders and holistic attribute discriminators for the effective imposition of semantic structures in generic generation and manipulation of text.
...