# Static Analysis for Regular Expression Exponential Runtime via Substructural Logics

@article{Rathnayake2014StaticAF, title={Static Analysis for Regular Expression Exponential Runtime via Substructural Logics}, author={Asiri Rathnayake and Hayo Thielecke}, journal={ArXiv}, year={2014}, volume={abs/1405.7058} }

Regular expression matching using backtracking can have exponential runtime, leading to an algorithmic complexity attack known as REDoS in the systems security literature. [...] We systematically construct a more accurate analysis by forming powers and products of transition relations and thereby reducing the REDoS problem to reachability. The correctness of the analysis is proved using a substructural calculus of search trees, where the branching of the tree causing exponential blowup is…
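
To make the exponential blowup concrete, the following is a minimal sketch of a toy backtracking matcher that counts its own search steps. It is not the analysis from the paper; the AST classes (`Chr`, `Seq`, `Alt`, `Star`), the function `count_steps`, and the example pattern `(a|a)*b` are all hypothetical names chosen for illustration. On inputs consisting only of `a`, every one of the two alternatives is retried at every position before the match fails, so the step count roughly doubles with each added character.

```python
from dataclasses import dataclass


# Hypothetical regex AST used only for this illustration.
@dataclass(frozen=True)
class Chr:
    c: str


@dataclass(frozen=True)
class Seq:
    first: object
    second: object


@dataclass(frozen=True)
class Alt:
    left: object
    right: object


@dataclass(frozen=True)
class Star:
    body: object


def count_steps(regex, text):
    """Run a naive backtracking match; return (matched, number of search steps)."""
    steps = 0

    def m(r, i, k):
        # r: regex node, i: input position, k: continuation taking the next position
        nonlocal steps
        steps += 1
        if isinstance(r, Chr):
            return i < len(text) and text[i] == r.c and k(i + 1)
        if isinstance(r, Seq):
            return m(r.first, i, lambda j: m(r.second, j, k))
        if isinstance(r, Alt):
            # Try the left branch first, then backtrack into the right branch.
            return m(r.left, i, k) or m(r.right, i, k)
        if isinstance(r, Star):
            # Try one more iteration (it must consume input), otherwise leave the loop.
            return m(r.body, i, lambda j: j > i and m(r, j, k)) or k(i)
        raise TypeError(f"unknown regex node: {r!r}")

    matched = m(regex, 0, lambda j: j == len(text))
    return matched, steps


# (a|a)*b : both alternatives accept the same character, so the search tree branches.
EVIL = Seq(Star(Alt(Chr("a"), Chr("a"))), Chr("b"))

if __name__ == "__main__":
    for n in range(1, 13):
        matched, steps = count_steps(EVIL, "a" * n)
        print(n, matched, steps)
```

A static analysis of the kind described in the abstract aims to detect such branching from the expression alone, without running the matcher on attack strings.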

## 27 Citations

### Sound Static Analysis of Regular Expressions for Vulnerabilities to Denial of Service Attacks

- Computer Science, TASE
- 2022

A framework based on a tree semantics to statically identify ReDoS vulnerabilities is introduced and an algorithm to extract an overapproximation of the set of words that are dangerous for a regular expression is put forward, effectively catching all possible attacks.

### Static analysis of regular expressions

- Computer Science
- 2017

A method for accurately modeling the matching time behaviour of a backtracking regular expression matcher, by using automata theoretic methods, is presented and analyzed by using the concept of ambiguity in nondeterministic finite-state automata.

### Regulator: Dynamic Analysis to Detect ReDoS

- Computer Science
- 2021

REGULATOR is developed, a novel dynamic, fuzzer-based analysis system for identifying regexps vulnerable to ReDoS that is implemented by directly instrumenting a popular backtracking regexp engine, which increases the scope of supported regexp syntax and features over prior work.

### Analyzing Matching Time Behavior of Backtracking Regular Expression Matchers by Using Ambiguity of NFA

- Computer Science, CIAA
- 2016

We apply results from ambiguity of non-deterministic finite automata to the problem of determining the asymptotic worst-case matching time, as a function of the length of the input strings, when…

### ReGiS: Regular Expression Simplification via Rewrite-Guided Synthesis

- Computer Science, ArXiv
- 2021

This work presents a new approach called rewrite-guided synthesis (ReGiS), in which a unique interplay between SyGuS and equality saturation-based rewriting helps to overcome problems, resulting in an efficient, scalable framework for expression simplification.

### Rethinking Regex engines to address ReDoS

- Computer Science, ESEC/SIGSOFT FSE
- 2019

It is reported that about 95% of regexes in popular programming languages can be evaluated in linear time, and it is described how the vast majority of regex matches can be made linear-time with minor, not major, changes to existing algorithms.

### Static Detection of DoS Vulnerabilities in Programs that Use Regular Expressions

- Computer Science, TACAS
- 2017

This paper proposes a technique for automatically finding ReDoS vulnerabilities in programs: it identifies vulnerable regular expressions in the program and determines whether an "evil" input string can be matched against a vulnerable regular expression.

### Kleenex: compiling nondeterministic transducers to deterministic streaming transducers

- Computer Science, POPL
- 2016

Kleenex, a language for expressing general nondeterministic finite transducers, and its novel compilation to streaming string transducers with essentially optimal streaming behavior, worst-case linear-time performance and sustained high throughput are presented.

### FlashRegex: Deducing Anti-ReDoS Regexes from Examples

- Computer Science, 2020 35th IEEE/ACM International Conference on Automated Software Engineering (ASE)
- 2020

This paper proposes a programming-by-example framework, FlashRegex, for generating anti-ReDoS regexes by either synthesizing or repairing from given examples, and is the first framework that integrates regex synthesis and repair with the awareness of ReDoS-vulnerabilities.

### Turning evil regexes harmless

- Computer Science, SAICSIT '17
- 2017

The relationship between ambiguity in automata and regular expressions and the matching time of backtracking regular expression matchers is explored, and techniques to reduce or remove ambiguity from regular expressions are investigated.

## References

### Checking Time Linearity of Regular Expression Matching Based on Backtracking

- Computer Science
- 2014

A method of checking whether or not regular expression matching runs in linear time by constructing a top-down tree transducer with regular lookahead that translates the input string into a tree corresponding to the execution steps of matching based on backtracking.

### Static Analysis for Regular Expression Denial-of-Service Attacks

- Computer Science, NSS
- 2013

Testing the analysis on two large repositories of regular expressions shows that the analysis is able to find significant numbers of vulnerable regular expressions in a matter of seconds, and has a firm theoretical foundation in abstract machines.

### Deciding ML typability is complete for deterministic exponential time

- Computer Science, POPL '90
- 1989

It is conjectured that lower bounds on deciding typability for extensions to the typed lambda calculus can be regarded precisely in terms of this expressive capacity for succinct function composition, which results in a proof of DEXPTIME-hardness.

### Proof-directed debugging

- Mathematics, Journal of Functional Programming
- 1999

The interplay between programming and proving is illustrated in the development of a program for regular expression matching by giving a plausible implementation of a regular expression matcher that contains a flaw that is uncovered in an attempt to prove its correctness.

### Regular expression sub-matching using partial derivatives

- Computer Science, PPDP
- 2012

The novel use of derivatives and partial derivatives for regular expression sub-matching is proposed and benchmarking results show that the run-time performance is promising and that the approach can be applied in practice.

### Regular expression containment: coinductive axiomatization and computational interpretation

- Computer Science, POPL '11
- 2011

This work presents a new sound and complete axiomatization of regular expression containment, and shows how to encode regular expression equivalence proofs in Salomaa's, Kozen's and Grabmayer's axiomatizations into their containment system, which equips their axiomatizations with a computational interpretation and implies completeness of the axiomatization.

### Context logic and tree update

- Computer Science, POPL '05
- 2005

Context Logic is introduced, studied, and used to reason locally about a small imperative programming language for updating trees, using a Hoare logic in the style of O'Hearn, Reynolds and Yang, and it is shown that weakest preconditions are derivable.

### Derivatives of Regular Expressions

- Mathematics, JACM
- 1964

In this paper the notion of a derivative of a regular expression is introduced and the properties of derivatives are discussed. This leads, in a very natural way, to the construction of a state diagram from a regular expression containing any number of logical operators.

### iNFAnt: NFA pattern matching on GPGPU devices

- Computer Science, CCRV
- 2010

iNFAnt is explicitly designed and developed to run on graphical processing units that provide large amounts of concurrent threads; this parallelism is exploited to handle the non-determinism of the model and to process multiple packets at once, thus achieving high performance levels.

### BI as an assertion language for mutable data structures

- Philosophy, POPL '01
- 2001

A model in which the law of the excluded middle holds is given, thus showing that the approach is compatible with classical logic, and a local character enjoyed by specifications in the logic is described, which enables a class of frame axioms, which say what parts of the heap don't change, to be inferred automatically.