# Intrusion Prevention through Optimal Stopping

@article{Hammar2021IntrusionPT, title={Intrusion Prevention through Optimal Stopping}, author={Kim Hammar and Rolf Stadler}, journal={ArXiv}, year={2021}, volume={abs/2111.00289} }

We study automated intrusion prevention using reinforcement learning. Following a novel approach, we formulate the problem of intrusion prevention as an (optimal) multiple stopping problem. This formulation gives us insight into the structure of optimal policies, which we show to have threshold properties. For most practical cases, it is not feasible to obtain an optimal defender policy using dynamic programming. We therefore develop a reinforcement learning approach to approximate an optimal…
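To make the idea of a threshold policy in a multiple stopping problem concrete, the following is a minimal sketch and not the authors' implementation. It assumes the defender tracks a scalar belief in [0, 1] that an intrusion is ongoing, has a fixed number of stop actions available, and takes the next stop whenever the belief reaches the threshold associated with the number of stops remaining. The names `threshold_policy` and `run_episode`, the belief sequence, and the threshold values are all hypothetical.

```python
from typing import List, Sequence


def threshold_policy(belief: float, stops_remaining: int,
                     thresholds: Sequence[float]) -> bool:
    """Return True (stop) if the belief reaches the threshold for this stop.

    thresholds[l - 1] is the (assumed) threshold used when l stops remain.
    """
    if stops_remaining <= 0:
        return False  # no stop actions left, so the defender can only continue
    return belief >= thresholds[stops_remaining - 1]


def run_episode(beliefs: Sequence[float],
                thresholds: Sequence[float]) -> List[int]:
    """Apply the threshold policy along a belief trajectory.

    Returns the time steps at which a stop action was taken.
    """
    stops_remaining = len(thresholds)
    stop_times: List[int] = []
    for t, belief in enumerate(beliefs):
        if threshold_policy(belief, stops_remaining, thresholds):
            stop_times.append(t)
            stops_remaining -= 1
    return stop_times


# Hypothetical belief trajectory and thresholds (illustrative values only).
# Assumed ordering: the fewer stops remain, the higher the threshold.
beliefs = [0.05, 0.2, 0.45, 0.7, 0.9, 0.95]
thresholds = [0.9, 0.6, 0.4]  # index 0: one stop left, index 2: three stops left
print(run_episode(beliefs, thresholds))
```

With such a policy, the defender's behavior is fully described by one number per remaining stop, which is what makes threshold structure attractive when dynamic programming over the full belief space is infeasible.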

## 4 Citations

### A System for Interactive Examination of Learned Security Policies

- Computer Science · NOMS 2022-2022 IEEE/IFIP Network Operations and Management Symposium
- 2022

The system enables insight into the structure of a given policy and into its behavior in edge cases, and examines the evolution of an IT infrastructure’s state and the actions prescribed by security policies while an attack occurs.

### Adaptive threat mitigation in SDN using improved D3QN

- Computer Science · Cloud Computing and Mechatronic Engineering
- 2022

The experimental results are evaluated and the convergence results of the improved algorithm model are given, showing the viability of reinforcement learning methods for adaptive threat mitigation in the SDN environment.

### Developing Optimal Causal Cyber-Defence Agents via Cyber Security Simulation

- Computer Science
- 2022

This paper proposes that DCBO can act as a blue agent when provided with a view of a simulated network and a causal model of how a red agent spreads within that network and provides numerical results which lay the foundations for future work in this space.

### Learning Security Strategies through Game Play and Optimal Stopping

- Computer Science · ArXiv
- 2022

The interaction between an attacker and a defender is formulated as an optimal stopping game, and attack and defense strategies evolve through reinforcement learning and self-play to produce effective defender strategies for a practical IT infrastructure.

## References

### Online Cyber-Attack Detection in Smart Grid: A Reinforcement Learning Approach

- Engineering, Computer Science · IEEE Transactions on Smart Grid
- 2019

This paper formulates the online attack/anomaly detection problem as a partially observable Markov decision process (POMDP) and proposes a universal robust online detection algorithm using the framework of model-free reinforcement learning (RL) for POMDPs.

### The Complexity of Markov Decision Processes

- Computer Science · Math. Oper. Res.
- 1987

All three variants of the classical problem of optimal policy computation in Markov decision processes (finite horizon, infinite horizon discounted, and infinite horizon average cost) are shown to be complete for P, and therefore most likely cannot be solved by highly parallel algorithms.

### Partially Observed Markov Decision Processes: From Filtering to Controlled Sensing

- Computer Science, Mathematics
- 2016

A tutorial on partially observable Markov decision processes, a tutorial on controlled stochastic process encyclopedia of mathematics, and optimal control of partially observable piecewise.

### Proximal Policy Optimization Algorithms

- Computer Science · ArXiv
- 2017

We propose a new family of policy gradient methods for reinforcement learning, which alternate between sampling data through interaction with the environment, and optimizing a "surrogate" objective…

### Detection of intrusions in information systems by sequential change-point methods

- Computer Science
- 2006

### Learning Intrusion Prevention Policies through Optimal Stopping

- Computer Science · 2021 17th International Conference on Network and Service Management (CNSM)
- 2021

This work formulates the problem of intrusion prevention as an optimal stopping problem and approximates the optimal policy through reinforcement learning in a simulation environment, showing that the learned policies are close to optimal and that they can indeed be expressed using thresholds.

### DeepAir: Deep Reinforcement Learning for Adaptive Intrusion Response in Software-Defined Networks

- Computer Science · IEEE Transactions on Network and Service Management
- 2022

### Deep hierarchical reinforcement agents for automated penetration testing

- Computer Science · ArXiv
- 2021

This work proposes a novel deep reinforcement learning architecture with hierarchically structured agents, called HA-DRL, which employs an algebraic action decomposition strategy to address the large discrete action space of an autonomous penetration testing simulator, where the number of actions grows exponentially with the complexity of the designed cybersecurity network.

### Using Cyber Terrain in Reinforcement Learning for Penetration Testing

- Computer Science · 2022 IEEE International Conference on Omni-layer Intelligent Systems (COINS)
- 2022

Terrain analysis is shown to bring realism to attack graphs for RL. The method is demonstrated on an attack graph with roughly 1000 vertices and 2300 edges, using deep Q reinforcement learning with experience replay.

### Towards Autonomous Defense of SDN Networks Using MuZero Based Intelligent Agents

- Computer Science · IEEE Access
- 2021

The results show that the defender is capable of deciding which measures minimize the impact of the intrusion, isolating the attacker and preventing it from compromising key machines in the network.