Context Aware Trace Clustering: Towards Improving Process Mining Results


Process mining refers to the extraction of process models from event logs. Real-life processes tend to be less structured and more flexible. Traditional process mining algorithms have problems dealing with such unstructured processes and generate spaghetti-like process models that are hard to comprehend. One way to overcome this is to cluster process instances (a process instance is manifested as a trace, and an event log corresponds to a multiset of traces) such that each resulting cluster corresponds to a coherent set of process instances that can be adequately represented by a process model. In this paper, we propose a context-aware approach to trace clustering based on generic edit distance. It is well known that the generic edit distance framework is highly sensitive to the costs of edit operations. We define an automated approach to derive the costs of edit operations. The method proposed in this paper outperforms contemporary approaches to trace clustering in process mining. We evaluate the goodness of the formed clusters using established fitness and comprehensibility metrics defined in the context of process mining. The proposed approach is able to generate clusters such that the process models mined from the clustered traces show a high degree of fitness and comprehensibility when compared to contem-
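To illustrate why the generic edit distance is sensitive to the costs of edit operations, here is a minimal Python sketch of a cost-parameterised edit distance between two traces. It is not the authors' implementation: the function name `generic_edit_distance`, the cost callbacks `sub_cost` and `indel_cost`, and the hand-picked cost values are all assumptions made for this example. The paper derives the costs automatically from the event log; that derivation is not reproduced here.

```python
from typing import Callable, Sequence


def generic_edit_distance(
    a: Sequence[str],
    b: Sequence[str],
    sub_cost: Callable[[str, str], float],
    indel_cost: Callable[[str], float],
) -> float:
    """Edit distance between traces a and b with caller-supplied costs.

    sub_cost(x, y) is the cost of replacing activity x by y;
    indel_cost(x) is the cost of inserting or deleting activity x.
    Both callbacks are hypothetical names used only in this sketch.
    """
    n, m = len(a), len(b)
    # dist[i][j] = cheapest way to turn a[:i] into b[:j]
    dist = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = dist[i - 1][0] + indel_cost(a[i - 1])
    for j in range(1, m + 1):
        dist[0][j] = dist[0][j - 1] + indel_cost(b[j - 1])
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dist[i][j] = min(
                dist[i - 1][j] + indel_cost(a[i - 1]),               # delete a[i-1]
                dist[i][j - 1] + indel_cost(b[j - 1]),               # insert b[j-1]
                dist[i - 1][j - 1] + sub_cost(a[i - 1], b[j - 1]),   # substitute or match
            )
    return dist[n][m]


# Unit costs reduce the generic distance to the plain Levenshtein distance.
def unit_sub(x: str, y: str) -> float:
    return 0.0 if x == y else 1.0


def unit_indel(x: str) -> float:
    return 1.0


# Illustrative (made-up) costs: activities "c" and "e" are treated as
# near-interchangeable, so substituting one for the other is cheap.
def similar_sub(x: str, y: str) -> float:
    if x == y:
        return 0.0
    return 0.2 if {x, y} == {"c", "e"} else 1.0


d_unit = generic_edit_distance("abcd", "abed", unit_sub, unit_indel)
d_ctx = generic_edit_distance("abcd", "abed", similar_sub, unit_indel)
```

With unit costs the two traces `abcd` and `abed` are one edit apart; with the illustrative cost for the `c`/`e` pair they are much closer. A clustering algorithm fed these distances would group the traces differently depending on the cost model, which is the sensitivity the abstract refers to.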

DOI: 10.1137/1.9781611972795.35


Semantic Scholar estimates that this publication has 110 citations based on the available data.


Cite this paper

@inproceedings{Bose2009ContextAT,
  title={Context Aware Trace Clustering: Towards Improving Process Mining Results},
  author={R. P. Jagadeesh Chandra Bose and Wil M. P. van der Aalst},
  booktitle={SDM},
  year={2009}
}