Intrusion Prevention through Optimal Stopping

@article{Hammar2021IntrusionPT,
  title={Intrusion Prevention through Optimal Stopping},
  author={Kim Hammar and Rolf Stadler},
  journal={ArXiv},
  year={2021},
  volume={abs/2111.00289}
}
We study automated intrusion prevention using reinforcement learning. Following a novel approach, we formulate the problem of intrusion prevention as an (optimal) multiple stopping problem. This formulation gives us insight into the structure of optimal policies, which we show to have threshold properties. For most practical cases, it is not feasible to obtain an optimal defender policy using dynamic programming. We therefore develop a reinforcement learning approach to approximate an optimal…
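As an illustration of what a threshold policy for a multiple stopping problem can look like, here is a minimal sketch. It is not the authors' implementation. It assumes the defender tracks a scalar belief (an estimated probability that an intrusion is ongoing), has a fixed number of stop actions, and uses the next stop as soon as the belief reaches that stop's threshold. The function name `run_threshold_policy`, the belief values, and the threshold values are all hypothetical.

```python
from typing import List, Sequence


def run_threshold_policy(beliefs: Sequence[float],
                         thresholds: Sequence[float]) -> List[int]:
    """Return the time steps at which the defender takes a stop action.

    beliefs[t]    -- hypothetical estimated probability of an ongoing
                     intrusion at time step t (assumption, not from the paper)
    thresholds[k] -- belief level required before the (k+1)-th stop is used
    At most one stop is taken per time step (a simplifying assumption).
    """
    stop_times: List[int] = []
    used = 0  # number of stop actions already taken
    for t, belief in enumerate(beliefs):
        if used == len(thresholds):
            break  # all stop actions have been used
        if belief >= thresholds[used]:
            stop_times.append(t)
            used += 1
    return stop_times


if __name__ == "__main__":
    beliefs = [0.1, 0.3, 0.6, 0.7, 0.95]
    thresholds = [0.5, 0.65, 0.9]
    print(run_threshold_policy(beliefs, thresholds))
```

In this sketch the whole policy is described by one number per stop action, which is why a threshold structure is convenient: only a handful of parameters need to be learned or tuned.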

A System for Interactive Examination of Learned Security Policies

  • K. Hammar, R. Stadler
  • Computer Science
    NOMS 2022-2022 IEEE/IFIP Network Operations and Management Symposium
  • 2022
The system provides insight into the structure of a given policy and into its behavior in edge cases, and examines the evolution of an IT infrastructure’s state and the actions prescribed by security policies while an attack occurs.

Adaptive threat mitigation in SDN using improved D3QN

The experimental results are evaluated and the convergence results of the improved algorithm model are given, showing that reinforcement learning methods are viable for adaptive threat mitigation in the SDN environment.

Developing Optimal Causal Cyber-Defence Agents via Cyber Security Simulation

This paper proposes that DCBO can act as a blue agent when provided with a view of a simulated network and a causal model of how a red agent spreads within that network and provides numerical results which lay the foundations for future work in this space.

Learning Security Strategies through Game Play and Optimal Stopping

The interaction between an attacker and a defender is formulated as an optimal stopping game, and attack and defense strategies evolve through reinforcement learning and self-play to produce effective defender strategies for a practical IT infrastructure.

References

Online Cyber-Attack Detection in Smart Grid: A Reinforcement Learning Approach

This paper formulates the online attack/anomaly detection problem as a partially observable Markov decision process (POMDP) and proposes a universal robust online detection algorithm using the framework of model-free reinforcement learning (RL) for POMDPs.

The Complexity of Markov Decision Processes

All three variants of the classical problem of optimal policy computation in Markov decision processes, finite horizon, infinite horizon discounted, and infinite horizon average cost are shown to be complete for P, and therefore most likely cannot be solved by highly parallel algorithms.

Partially Observed Markov Decision Processes: From Filtering to Controlled Sensing

A tutorial on partially observable Markov decision processes, a tutorial on controlled stochastic process encyclopedia of mathematics, and optimal control of partially observable piecewise.

Proximal Policy Optimization Algorithms

We propose a new family of policy gradient methods for reinforcement learning, which alternate between sampling data through interaction with the environment, and optimizing a "surrogate" objective

Learning Intrusion Prevention Policies through Optimal Stopping

  • K. Hammar, R. Stadler
  • Computer Science
    2021 17th International Conference on Network and Service Management (CNSM)
  • 2021
This work formulates the problem of intrusion prevention as an optimal stopping problem and approximates the optimal policy through reinforcement learning in a simulation environment, showing that the learned policies are close to optimal and that they can indeed be expressed using thresholds.

DeepAir: Deep Reinforcement Learning for Adaptive Intrusion Response in Software-Defined Networks

Deep hierarchical reinforcement agents for automated penetration testing

This work presents HA-DRL, a novel deep reinforcement learning architecture with hierarchically structured agents, which employs an algebraic action decomposition strategy to address the large discrete action space of an autonomous penetration testing simulator, where the number of actions increases exponentially with the complexity of the designed cybersecurity network.

Using Cyber Terrain in Reinforcement Learning for Penetration Testing

It is shown that terrain analysis can bring realism to attack graphs for RL; the method is demonstrated on an attack graph with roughly 1000 vertices and 2300 edges, using deep Q reinforcement learning with experience replay.

Towards Autonomous Defense of SDN Networks Using MuZero Based Intelligent Agents

The results show that the defender is capable of deciding which measures minimize the impact of the intrusion, isolating the attacker and preventing it from compromising key machines in the network.
...