Domain-Adversarial and -Conditional State Space Model for Imitation Learning

Ryogo Okumura, Masashi Okada, Tadahiro Taniguchi
2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
State representation learning (SRL) in partially observable Markov decision processes has been studied to learn abstract features of data useful for robot control tasks. For SRL, acquiring domain-agnostic states is essential for achieving efficient imitation learning. Without these states, imitation learning is hampered by domain-dependent information useless for control. However, existing methods fail to remove such disturbances from the states when the data from experts and agents show large… 

Domain-Robust Visual Imitation Learning with Mutual Information Constraints

This paper introduces a new algorithm, Disentangling Generative Adversarial Imitation Learning (DisentanGAIL), which enables autonomous agents to learn directly from high-dimensional observations of an expert performing a task, by making use of adversarial learning with a latent representation inside the discriminator network.

Dreaming: Model-based Reinforcement Learning by Latent Imagination without Reconstruction

This work aims to relieve a bottleneck in Dreamer and enhance its performance by removing the decoder, and derives a likelihood-free, InfoMax objective of contrastive learning from the evidence lower bound of Dreamer.

PlaNet of the Bayesians: Reconsidering and Improving Deep Planning Network by Incorporating Bayesian Inference

The proposed extension makes PlaNet uncertainty-aware on the basis of Bayesian inference, incorporating both model and action uncertainty, and it is concluded that the method can consistently improve the asymptotic performance compared with PlaNet.

Semiotically adaptive cognition: toward the realization of remotely-operated service robots for the new normal symbiotic society

This paper argues that the development of semiotically adaptive cognitive systems is key to the installation of service robotics technologies in the authors' service environments and describes three challenges: the learning of local knowledge, the acceleration of onsite and online learning, and the augmentation of human–robot interactions.

Class Is Invariant to Context and Vice Versa: On Learning Invariance for Out-Of-Distribution Generalization

This work argues that the widely adopted assumption, that the context bias can be directly annotated or estimated from biased class prediction, renders the context incomplete or even incorrect, and implements the ever-overlooked other side of the above principle: context is also invariant to class, which motivates considering the classes as the varying environments to resolve context bias.

Tactile-Sensitive NewtonianVAE for High-Accuracy Industrial Connector Insertion

This work proposes a tactile-sensitive NewtonianVAE and applies it to USB connector insertion with grasp pose variation in physical environments, shows that the original NewtonianVAE fails in some situations, and demonstrates that domain knowledge induction improves model accuracy.



Learning Belief Representations for Imitation Learning in POMDPs

Evaluated on various partially observable continuous-control locomotion tasks, the belief-module imitation learning approach (BMIL) substantially outperforms several baselines, including the original GAIL algorithm and the task-agnostic belief learning algorithm.

End-to-End Differentiable Adversarial Imitation Learning

The Model-based Generative Adversarial Imitation Learning (MGAIL) algorithm is introduced, which uses a forward model to make the computation fully differentiable, enabling policies to be trained using the exact gradient of the discriminator.

Imitation from Observation: Learning to Imitate Behaviors from Raw Video via Context Translation

This work proposes an imitation learning method based on video prediction with context translation and deep reinforcement learning that enables a variety of interesting applications, including learning robotic skills that involve tool use simply by observing videos of human tool use.

Directed-Info GAIL: Learning Hierarchical Policies from Unsegmented Demonstrations using Directed Information

This work discovers the interaction between sub-tasks from their resulting state-action trajectory sequences using a directed graphical model and proposes a new algorithm based on the generative adversarial imitation learning framework which automatically learns sub-task policies from unsegmented demonstrations.

Learning Latent Dynamics for Planning from Pixels

The Deep Planning Network (PlaNet) is proposed, a purely model-based agent that learns the environment dynamics from images and chooses actions through fast online planning in latent space using a latent dynamics model with both deterministic and stochastic transition components.

Generative Adversarial Imitation Learning

A new general framework for directly extracting a policy from data, as if it were obtained by reinforcement learning following inverse reinforcement learning, is proposed and a certain instantiation of this framework draws an analogy between imitation learning and generative adversarial networks.

Multi-Task Domain Adaptation for Deep Learning of Instance Grasping from Simulation

A multi-task domain adaptation framework for instance grasping in cluttered scenes is proposed that utilizes simulated robot experiments and uses a domain-adversarial loss to transfer the trained model to real robots using indiscriminate grasping data, which is available both in simulation and the real world.

InfoGAIL: Interpretable Imitation Learning from Visual Demonstrations

A new algorithm is proposed that can infer the latent structure of expert demonstrations in an unsupervised way, built on top of Generative Adversarial Imitation Learning, and can not only imitate complex behaviors, but also learn interpretable and meaningful representations of complex behavioral data, including visual demonstrations.

Learning human behaviors from motion capture by adversarial imitation

Generative adversarial imitation learning is extended to enable training of generic neural network policies to produce humanlike movement patterns from limited demonstrations consisting only of partially observed state features, without access to actions, even when the demonstrations come from a body with different and unknown physical parameters.

Adversarial Discriminative Domain Adaptation

It is shown that ADDA is more effective yet considerably simpler than competing domain-adversarial methods, and the promise of the approach is demonstrated by exceeding state-of-the-art unsupervised adaptation results on standard domain adaptation tasks as well as a difficult cross-modality object classification task.