Corpus ID: 235377213

Who Is the Strongest Enemy? Towards Optimal and Efficient Evasion Attacks in Deep RL

Yanchao Sun, Ruijie Zheng, Yongyuan Liang, Furong Huang

Evaluating the worst-case performance of a reinforcement learning (RL) agent under the strongest/optimal adversarial perturbations on state observations (within some constraints) is crucial for understanding the robustness of RL agents. However, finding the optimal adversary is challenging, in terms of both whether we can find the optimal attack and how efficiently we can find it. Existing works on adversarial RL either use heuristics-based methods that may not find the strongest adversary, or… 

Certifiably Robust Policy Learning against Adversarial Communication in Multi-agent Systems

This work considers an environment with N agents, where the attacker may arbitrarily change the communication from any $C < \frac{N-1}{2}$ agents to a victim agent, and proposes a certifiable defense by constructing a message-ensemble policy that aggregates multiple randomly ablated message sets.
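
The aggregation idea in this summary can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes discrete actions and majority voting as the aggregation rule, and the names `ensemble_action`, `toy_policy`, `subset_size`, and `num_samples` are hypothetical, introduced only for illustration.

```python
import random
from collections import Counter
from typing import Callable, Hashable, Sequence


def ensemble_action(
    messages: Sequence[Hashable],
    base_policy: Callable[[tuple], int],
    subset_size: int,
    num_samples: int,
    rng: random.Random,
) -> int:
    """Majority vote over actions chosen from randomly ablated message sets.

    Assumption: actions are discrete and the ensemble aggregates by voting.
    """
    votes = Counter()
    for _ in range(num_samples):
        # Keep a random subset of message indices; the rest are ablated.
        kept_idx = sorted(rng.sample(range(len(messages)), subset_size))
        kept = tuple(messages[i] for i in kept_idx)
        votes[base_policy(kept)] += 1
    # Return the action that received the most votes.
    return votes.most_common(1)[0][0]


def toy_policy(kept: tuple) -> int:
    """Hypothetical base policy: act on the most frequent received message."""
    return Counter(kept).most_common(1)[0][0]


# Four honest agents send 2; one attacked channel sends 7.
action = ensemble_action([2, 2, 7, 2, 2], toy_policy, 3, 25, random.Random(0))
```

In this toy setting every 3-message subset contains at least two honest messages, so each base decision, and therefore the ensemble vote, ignores the corrupted message.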

CROP: Certifying Robust Policies for Reinforcement Learning through Functional Smoothing

This paper presents CROP (Certifying Robust Policies for RL), the first unified framework to provide robustness certification at both the action and reward levels, and proposes two robustness certification criteria: robustness of per-state actions and a lower bound on cumulative rewards.

Reinforcement Learning for Feedback-Enabled Cyber Resilience

Efficient Adversarial Training without Attacking: Worst-Case-Aware Robust Reinforcement Learning

This work proposes a strong and efficient robust training framework for RL, named Worst-case-aware Robust RL (WocaR-RL), that directly estimates and optimizes the worst-case reward of a policy under bounded attacks without requiring extra samples for learning an attacker.

Trustworthy Reinforcement Learning Against Intrinsic Vulnerabilities: Robustness, Safety, and Generalizability

This study surveys the main perspectives of trustworthy reinforcement learning with respect to its intrinsic vulnerabilities in robustness, safety, and generalizability; it gives rigorous formulations, categorizes the corresponding methodologies, and discusses benchmarks for each perspective.

SoK: Adversarial Machine Learning Attacks and Defences in Multi-Agent Reinforcement Learning

A novel perspective for understanding how an AML attack is perpetrated is proposed by defining Attack Vectors, and two new frameworks are developed to address a gap in current modelling frameworks.

A Survey on Reinforcement Learning Security with Application to Autonomous Driving

Reinforcement learning is used in safety-critical applications, such as autonomous driving, despite being vulnerable to attacks carefully crafted to prevent the reinforcement learning algorithm from learning an effective and reliable policy.

Distributional Reward Estimation for Effective Multi-Agent Deep Reinforcement Learning

A novel Distributional Reward Estimation framework for effective Multi-Agent Reinforcement Learning (DRE-MARL) is proposed, which uses multi-action-branch reward estimation and policy-weighted reward aggregation to stabilize training.



Policy Teaching via Environment Poisoning: Training-time Adversarial Attacks against Reinforcement Learning

The results show that the attacker can easily succeed in teaching any target policy to the victim under mild conditions and highlight a significant security threat to reinforcement learning agents in practice.

Tactics of Adversarial Attack on Deep Reinforcement Learning Agents

A novel method for determining when an adversarial example should be crafted and applied is proposed; it applies to agents trained by state-of-the-art deep reinforcement learning algorithms, including DQN and A3C.

Adversarial Policies: Attacking Deep Reinforcement Learning

The existence of adversarial policies is demonstrated in zero-sum games between simulated humanoid robots with proprioceptive observations, against state-of-the-art victims trained via self-play to be robust to opponents.

Robust Deep Reinforcement Learning through Adversarial Loss

RADIAL-RL, a method to train reinforcement learning agents with improved robustness against any $l_p$-bounded adversarial attack, is proposed, along with a new evaluation method, Greedy Worst-Case Reward (GWC), for measuring the attack-agnostic robustness of RL agents.

Optimal Attacks on Reinforcement Learning Policies

This paper investigates the problem of devising optimal attacks, depending on a well-defined attacker's objective, and demonstrates that using Reinforcement Learning techniques tailored to POMDP leads to more resilient policies.

Deceptive Reinforcement Learning Under Adversarial Manipulations on Cost Signals

This paper studies reinforcement learning under malicious falsification of cost signals, introduces a quantitative framework of attack models to understand the vulnerabilities of RL, and proposes a robust region in terms of the cost within which the adversary can never achieve the targeted policy.

Robust Deep Reinforcement Learning against Adversarial Perturbations on State Observations

The state-adversarial Markov decision process (SA-MDP) is proposed, and a theoretically principled policy regularization is developed which can be applied to a large family of DRL algorithms, including proximal policy optimization (PPO), deep deterministic policy gradient (DDPG) and deep Q networks (DQN), for both discrete and continuous action control problems.

Robust Adversarial Reinforcement Learning

RARL is proposed, where an agent is trained to operate in the presence of a destabilizing adversary that applies disturbance forces to the system; the jointly trained adversary is reinforced, that is, it learns an optimal destabilization policy.

Vulnerability-Aware Poisoning Mechanism for Online RL with Unknown Dynamics

This work proposes a strategic poisoning algorithm called Vulnerability-Aware Adversarial Critic Poison (VA2C-P), which works for most policy-based deep RL agents, using a novel metric, stability radius in RL, that measures the vulnerability of RL algorithms.

Query-based targeted action-space adversarial policies on deep reinforcement learning agents

This work investigates targeted attacks in the action-space domain (actuation attacks), which perturb the outputs of a controller, and proposes the use of adversarial training with transfer learning to induce robust behaviors in the nominal policy, which decreases the rate of successful targeted attacks.