Scalable Process Discovery with Guarantees

@inproceedings{Leemans2015ScalablePD,
  title={Scalable Process Discovery with Guarantees},
  author={Sander J. J. Leemans and Dirk Fahland and Wil M. P. van der Aalst},
  booktitle={BPMDS/EMMSAD},
  year={2015}
}
Organisations nowadays collect and store considerable amounts of data, including process event data. Process discovery algorithms aim to discover a process model from recorded process event data. Many techniques have been proposed, but none combines scalability with quality guarantees, e.g. none can handle billions of events or thousands of activities while producing sound models (without deadlocks and other anomalies) and guaranteeing to rediscover the underlying process in some… 
Scalable process discovery and conformance checking
TLDR
This paper introduces a framework for process discovery that ensures these properties while passing over the log only once, presents three algorithms that use the framework, and introduces a model–model and model–log comparison framework that applies a divide-and-conquer strategy to measure recall, fitness, and precision.
How Much Event Data Is Enough? A Statistical Framework for Process Discovery
TLDR
This paper presents a framework for process discovery that relies on statistical pre-processing of an event log and significantly reduces its size by means of sampling, which lowers the runtime and memory footprint of process discovery algorithms while providing guarantees on the introduced sampling error.
Event stream-based process discovery using abstract representations
TLDR
This paper proposes a generic architecture that allows for adopting several classes of existing process discovery techniques in the context of event streams, and provides several instantiations of the architecture accompanied by implementations in the process mining toolkit ProM (http://promtools.org).
Efficient Event Correlation over Distributed Systems
TLDR
This paper proposes a new algorithm, called RF-GraP, which provides more efficient correlation over distributed systems and achieves significant performance speedups with clearly less network communication.
Process mining with streaming data
TLDR
This thesis explores, develops and analyses process mining techniques that are able to handle streaming event data, and identifies three main types of process mining analysis, i.e. process discovery, conformance checking and process enhancement.
Fusion-Based Process Discovery
TLDR
This work argues that, instead of relying on a single algorithm, the outcomes of different algorithms should be fused to combine the strengths of individual approaches. It proposes a general framework for such fusion and instantiates two new discovery algorithms: the Exhaustive Noise-aware Inductive Miner (exNoise), which exhaustively searches for model improvements, and the Adaptive Noise, a computationally tractable version of exNoise.
Automated Discovery of Process Models from Event Logs: Review and Benchmark
TLDR
The results highlight gaps and unexplored tradeoffs in the field, including the lack of scalability of some methods and a strong divergence in their performance with respect to the different quality metrics used.
All That Glitters Is Not Gold: Towards Process Discovery Techniques with Guarantees
TLDR
This paper distinguishes four incremental stages for the development of process discovery algorithms with properties that relate qualities of their inputs to those of their outputs, along with concrete guidelines for the formulation of relevant properties and experimental validation.
Non-Local Correction of Process Models Using Event Logs
  • A. A. Mitsyuk
  • Computer Science
    2017 Ivannikov ISPRAS Open Conference (ISPRAS)
  • 2017
TLDR
This paper describes an algorithm for non-local process correction that employs two fundamental algorithmic paradigms, “divide and conquer” and “greedy processing”: it decomposes the process model and repairs sub-models greedily, using event logs that record the actual behavior.
Process discovery from event data: Relating models and logs through abstractions
TLDR
This article discusses four discovery approaches involving three abstractions and different types of process models (Petri nets, block-structured models, and declarative models) and aims to unify existing approaches by focusing on log and model abstractions.

References

SHOWING 1-10 OF 30 REFERENCES
Discovering Block-Structured Process Models from Incomplete Event Logs
TLDR
This paper introduces probabilistic behavioural relations that are less sensitive to incompleteness, and exploits these relations to provide a more robust process discovery algorithm that proves to be able to rediscover a model of the original system.
Scalable Process Discovery Using Map-Reduce
  • Joerg Evermann
  • Computer Science
    IEEE Transactions on Services Computing
  • 2016
TLDR
This paper presents Map-Reduce implementations of two well-known process mining algorithms to take advantage of the scalability of the Map-Reduce approach, and describes the design of a series of mappers and reducers that compute the log-based ordering relations from distributed event logs.
Discovering Block-Structured Process Models from Event Logs - A Constructive Approach
TLDR
This work provides an extensible framework to discover from any given log a set of block-structured process models that are sound and fit the observed behaviour, and gives sufficient conditions on the log for which the algorithm returns a model that is language-equivalent to the process model underlying the log, including unseen behaviour.
Discovering Block-Structured Process Models from Event Logs Containing Infrequent Behaviour
TLDR
This work presents a technique able to cope with infrequent behaviour and large event logs, while ensuring soundness, and compares the technique with existing approaches in terms of quality and performance.
Scalable Dynamic Business Process Discovery with the Constructs Competition Miner
TLDR
A set of modifications is proposed for the CCM to enable scalable dynamic business process discovery of a run-time process model from a stream of events, and the behaviour of the algorithm on event streams of dynamically changing processes is investigated.
Workflow mining: discovering process models from event logs
TLDR
A new algorithm is presented to extract a process model from a so-called "workflow log" containing information about the workflow process as it is actually being executed and represent it in terms of a Petri net.
Replaying history on process models for conformance checking and performance analysis
TLDR
This paper elaborates on the importance of maintaining a proper alignment between event log and process model, and on the application of such alignments to conformance checking and performance analysis.
Mining process models with non-free-choice constructs
TLDR
This paper proposes an algorithm that is able to deal with both kinds of causal dependencies between tasks, i.e., explicit and implicit ones, implements it in the ProM framework, and reports experimental results showing that the algorithm indeed significantly improves existing process mining techniques.
Constructs Competition Miner: Process Control-Flow Discovery of BP-Domain Constructs
TLDR
This paper proposes an algorithm that follows a top-down approach to directly mine a process model which consists of common BP-domain constructs and represents the main behaviour of the process.
Mining Invisible Tasks from Event Logs
TLDR
This paper proposes a new process mining algorithm named α#, which extends the mining capacity of the classical α algorithm: it supports the detection of invisible tasks from event logs by introducing a new ordering relation for detecting mendacious dependencies between tasks.