Corpus ID: 252596278

Human-AI Shared Control via Policy Dissection

Quanyi Li, Zhenghao Peng, Haibin Wu, Lan Feng, Bolei Zhou
Human-AI shared control allows humans to interact and collaborate with autonomous agents to accomplish control tasks in complex environments. Previous Reinforcement Learning (RL) methods attempted goal-conditioned designs to achieve human-controllable policies, at the cost of redesigning the reward function and training paradigm. Inspired by the neuroscience approach to investigating the motor cortex in primates, we develop a simple yet effective frequency-based approach called Policy Dissection to… 

Proximal Policy Optimization Algorithms

We propose a new family of policy gradient methods for reinforcement learning, which alternate between sampling data through interaction with the environment and optimizing a "surrogate" objective function using stochastic gradient ascent.
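The clipped surrogate objective at the core of this alternation can be sketched as follows. This is a minimal illustration, not the paper's implementation; the function name is hypothetical, and `eps=0.2` follows the paper's default clipping range:

```python
import numpy as np

def ppo_clipped_surrogate(ratio, advantage, eps=0.2):
    """Clipped surrogate objective (sketch).

    ratio:     pi_theta(a|s) / pi_theta_old(a|s) for sampled actions
    advantage: estimated advantages for those actions
    eps:       clipping range around a ratio of 1
    """
    unclipped = ratio * advantage
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps) * advantage
    # Per-sample pessimistic (lower) bound, averaged over the batch.
    return np.mean(np.minimum(unclipped, clipped))
```

Clipping removes the incentive to move the policy ratio far outside `[1 - eps, 1 + eps]`, which is what makes multiple epochs of minibatch updates on the same sampled data stable.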

Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor

This paper proposes soft actor-critic, an off-policy actor-critic deep RL algorithm based on the maximum entropy reinforcement learning framework, and achieves state-of-the-art performance on a range of continuous control benchmark tasks, outperforming prior on-policy and off-policy methods.
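The maximum-entropy framework augments the usual TD target with an entropy bonus. A minimal sketch of such an entropy-regularized target, assuming a hypothetical helper name and illustrative defaults for the discount `gamma` and temperature `alpha`:

```python
def soft_value_target(reward, next_q, next_log_prob, gamma=0.99, alpha=0.2):
    """Entropy-regularized TD target (sketch).

    The -alpha * log_prob term adds the policy's entropy to the return,
    encouraging exploration while still maximizing reward.
    """
    return reward + gamma * (next_q - alpha * next_log_prob)
```

With `alpha = 0`, this reduces to the standard (non-entropy-regularized) TD target.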

MetaDrive: Composing Diverse Driving Scenarios for Generalizable Reinforcement Learning

The generalization experiments conducted on both procedurally generated scenarios and real-world scenarios show that increasing the diversity and size of the training set improves the generalizability of the RL agent.

Learning to Walk in Minutes Using Massively Parallel Deep Reinforcement Learning

A training setup is presented that achieves fast policy generation for real-world robotic tasks by using massive parallelism on a single workstation GPU, together with a novel game-inspired curriculum well suited to training thousands of simulated robots in parallel.

Inducing Functions through Reinforcement Learning without Task Specification

The experimental results show that high-level functions, such as image classification and hidden-variable estimation, can be naturally and simultaneously induced without any pre-training or explicit task specification.

Learning high-speed flight in the wild

This work demonstrates that end-to-end policies trained in simulation enable high-speed autonomous flight through challenging environments, outperforming traditional obstacle avoidance pipelines.

Safe Driving via Expert Guided Policy Optimization

This work develops a novel Expert Guided Policy Optimization (EGPO) method that integrates a guardian into the loop of reinforcement learning, composed of an expert policy that generates demonstrations and a switch function that decides when the expert should intervene.

Recent advances in leveraging human guidance for sequential decision-making tasks

This survey provides a high-level overview of five recent machine learning frameworks that primarily rely on human guidance beyond pre-specified reward functions or conventional, step-by-step action demonstrations.

Learning Vision-Guided Quadrupedal Locomotion End-to-End with Cross-Modal Transformers

The Transformer achieves better generalization when tested on unseen environments and in the real world, showing that it provides an effective mechanism for fusing proprioceptive and visual information and opens new possibilities for reinforcement learning with multi-modal inputs.

RMA: Rapid Motor Adaptation for Legged Robots

The Rapid Motor Adaptation algorithm is presented to solve the problem of real-time online adaptation in quadruped robots; it is trained entirely in simulation without using any domain knowledge such as reference trajectories or predefined foot trajectory generators, and is deployed on the A1 robot without any fine-tuning.