Corpus ID: 31009408

# Distral: Robust multitask reinforcement learning

@inproceedings{Teh2017DistralRM,
title={Distral: Robust multitask reinforcement learning},
author={Yee Whye Teh and Victor Bapst and Wojciech M. Czarnecki and John Quan and James Kirkpatrick and Raia Hadsell and Nicolas Manfred Otto Heess and Razvan Pascanu},
booktitle={NIPS},
year={2017}
}
Most deep reinforcement learning algorithms are data inefficient in complex and rich environments, limiting their applicability to many scenarios. [...] Each worker is trained to solve its own task while constrained to stay close to the shared policy, while the shared policy is trained by distillation to be the centroid of all task policies. Both aspects of the learning process are derived by optimizing a joint objective function. We show that our approach supports efficient transfer on complex 3D…
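The two coupled updates described in the abstract can be illustrated with a toy sketch. This is not the paper's implementation. It assumes a one-step (bandit) problem per task, tabular policies, and an objective of the form E[r] - c_kl * KL(task policy || shared policy), leaving out sequential structure, neural networks, and any other terms of the paper's joint objective. The names `task_policy`, `distill`, `alternate`, and `c_kl` are hypothetical. Under these assumptions both steps have closed forms: the best task policy is the shared policy reweighted by exp(r / c_kl), and the shared policy minimising the summed KL from the task policies is their arithmetic mean, which is one concrete reading of "centroid".

```python
import numpy as np

def kl(p, q):
    """KL(p || q) for two discrete distributions with full support."""
    return float(np.sum(p * (np.log(p) - np.log(q))))

def task_policy(rewards, shared, c_kl):
    """Maximiser of E_p[r] - c_kl * KL(p || shared) over distributions p.

    The closed form is p(a) proportional to shared(a) * exp(r(a) / c_kl).
    """
    logits = np.log(shared) + rewards / c_kl
    logits -= logits.max()  # subtract the max for numerical stability
    p = np.exp(logits)
    return p / p.sum()

def distill(task_policies):
    """Minimiser of sum_i KL(pi_i || q) over q: the mean of the pi_i."""
    return np.mean(task_policies, axis=0)

def alternate(rewards_per_task, c_kl=1.0, steps=50):
    """Alternate the two closed-form updates (assumes steps >= 1)."""
    n_actions = rewards_per_task.shape[1]
    shared = np.full(n_actions, 1.0 / n_actions)  # start from uniform
    for _ in range(steps):
        # Each task solves its own problem while staying close to `shared`.
        policies = np.stack(
            [task_policy(r, shared, c_kl) for r in rewards_per_task]
        )
        # The shared policy is refit as the centroid of the task policies.
        shared = distill(policies)
    return policies, shared

# Two hypothetical tasks over three actions (illustrative rewards only).
rewards = np.array([[1.0, 0.0, 0.5],
                    [0.0, 1.0, 0.5]])
policies, shared = alternate(rewards)
```

A larger `c_kl` pulls every task policy toward the shared one, while a smaller value lets each task follow its own rewards more closely.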

#### 306 Citations

Multi-task Deep Reinforcement Learning with PopArt
• Computer Science, Mathematics
• AAAI
• 2019
This work proposes to automatically adapt the contribution of each task to the agent’s updates, so that all tasks have a similar impact on the learning dynamics, and learns a single trained policy that exceeds median human performance on this multi-task domain.
Sharing Experience in Multitask Reinforcement Learning
A Sharing Experience Framework for simultaneous training of multiple tasks that uses task-specific rewards from the environment to identify similar parts that should be shared across tasks, and defines those parts as shared regions between tasks.
• 2019
Sharing knowledge between tasks is vital for efficient learning in a multi-task setting. However, most research so far has focused on the easier case where knowledge transfer is not harmful, i.e.,…
Multi-task Deep Reinforcement Learning: a Combination of Rainbow and DisTraL
• 2020 6th Iranian Conference on Signal Processing and Intelligent Systems (ICSPIS)
• 2020
Among the previous studies on deep reinforcement learning (DRL) challenges, optimizing data usage efficiency in rich and complex environments has been discussed by many researchers. Recently many…
• Computer Science, Mathematics
• ArXiv
• 2018
The proposed framework is based on differential policy gradients and can accommodate multi-task learning in a single actor-critic network; a simple heuristic in the differential policy gradient update is proposed to further improve learning.
While deep reinforcement learning systems have demonstrated impressive results in domains ranging from game playing to robotic control, sample efficiency remains a major challenge, particularly as…
Mutual Information Based Knowledge Transfer Under State-Action Dimension Mismatch
• Computer Science, Mathematics
• UAI
• 2020
Successful transfer learning is demonstrated in situations when the teacher and student have different state- and action-spaces, and embeddings are produced which can systematically extract knowledge from the teacher policy and value networks, and blend it into the student networks.
Conservative Data Sharing for Multi-Task Offline Reinforcement Learning
• Tianhe Yu
• Computer Science
• ArXiv
• 2021
This work develops a simple technique for data sharing in multi-task offline RL that routes data based on the improvement over the task-specific data, and achieves the best or comparable performance compared to prior offline multi-task RL methods and previous data-sharing approaches.
• Computer Science, Mathematics
• ECML/PKDD
• 2019
This work presents an approach to multi-task deep reinforcement learning based on attention that does not require any a priori assumptions about the relationships between tasks; it achieves positive knowledge transfer where possible and avoids negative transfer in cases where tasks interfere.
Fractional Transfer Learning for Deep Model-Based Reinforcement Learning
• Computer Science
• ArXiv
• 2021
Fractional transfer learning is presented; the idea is to transfer fractions of knowledge, as opposed to discarding potentially useful knowledge as is commonly done with random initialization, using the World Model-based Dreamer algorithm.

#### References

SHOWING 1-10 OF 37 REFERENCES
Actor-Mimic: Deep Multitask and Transfer Reinforcement Learning
• Computer Science, Mathematics
• ICLR
• 2016
This work defines a novel method of multitask and transfer learning that enables an autonomous agent to learn how to behave in multiple tasks simultaneously, and then generalize its knowledge to new domains, and uses Atari games as a testing environment to demonstrate these methods.
Reinforcement Learning with Unsupervised Auxiliary Tasks
This paper significantly outperforms the previous state-of-the-art on Atari, averaging 880% expert human performance, and a challenging suite of first-person, three-dimensional Labyrinth tasks leading to a mean speedup in learning of 10× and averaging 87% expert human performance on Labyrinth.
Policy Distillation
A novel method called policy distillation is presented that can be used to extract the policy of a reinforcement learning agent and train a new network that performs at the expert level while being dramatically smaller and more efficient.
Deep Reinforcement Learning with Double Q-Learning
• Computer Science
• AAAI
• 2016
This paper proposes a specific adaptation to the DQN algorithm and shows that the resulting algorithm not only reduces the observed overestimations, as hypothesized, but that this also leads to much better performance on several games.
An Introduction to Intertask Transfer for Reinforcement Learning
• Computer Science
• AI Mag.
• 2011
This article focuses on transfer in the context of reinforcement learning domains, a general learning framework where an agent acts in an environment to maximize a reward signal.
End-to-End Training of Deep Visuomotor Policies
• Computer Science, Mathematics
• J. Mach. Learn. Res.
• 2016
This paper develops a method that can be used to learn policies that map raw image observations directly to torques at the robot's motors, trained using a partially observed guided policy search method, with supervision provided by a simple trajectory-centric reinforcement learning method.
Human-level control through deep reinforcement learning
This work bridges the divide between high-dimensional sensory inputs and actions, resulting in the first artificial agent that is capable of learning to excel at a diverse array of challenging tasks.
Playing FPS Games with Deep Reinforcement Learning
• Computer Science
• AAAI
• 2017
This paper presents the first architecture to tackle 3D environments in first-person shooter games that involve partially observable states, and substantially outperforms built-in AI agents of the game as well as humans in deathmatch scenarios.
Reinforcement Learning with Deep Energy-Based Policies
• Computer Science
• ICML
• 2017
A method is proposed for learning expressive energy-based policies for continuous states and actions, which has been feasible only in tabular domains before, and a new algorithm, called soft Q-learning, that expresses the optimal policy via a Boltzmann distribution is applied.
Taming the Noise in Reinforcement Learning via Soft Updates
• Computer Science, Mathematics
• UAI
• 2016
G-learning is proposed, a new off-policy learning algorithm that regularizes the noise in the space of optimal actions by penalizing deterministic policies at the beginning of learning, which enables naturally incorporating prior distributions over optimal actions when available.