Discovering Useful Compact Sets of Sequential Rules in a Long Sequence

  title={Discovering Useful Compact Sets of Sequential Rules in a Long Sequence},
  author={Erwan Bourrand and Luis Gal{\'a}rraga and Esther Galbrun and {\'E}lisa Fromont and Alexandre Termier},
  journal={2021 IEEE 33rd International Conference on Tools with Artificial Intelligence (ICTAI)},
We are interested in understanding the underlying generation process for long sequences of symbolic events. To do so, we propose COSSU, an algorithm to mine small and meaningful sets of sequential rules. The rules are selected using an MDL-inspired criterion that favors compactness and relies on a novel rule-based encoding scheme for sequences. Our evaluation shows that COSSU can successfully retrieve relevant sets of closed sequential rules from a long sequence. Such rules constitute an… 

Using Partially-Ordered Sequential Rules to Generate More Accurate Sequence Prediction
Experiments on large click-stream datasets for webpage recommendation show that using a new type of sequential rules named partially-ordered sequential rules can greatly increase prediction accuracy, while requiring a smaller training set.
Efficiently Summarising Event Sequences with Rich Interleaving Patterns
This paper proposes SQUISH, a novel greedy MDL-based method for summarising sequential data using rich patterns that are allowed to interleave, and shows that this yields better models and discovers meaningful semantics in the form of patterns that identify multiple choices of values.
The long and the short of it: summarising event sequences with serial episodes
This paper formalises how to encode sequential data using sets of serial episodes, and uses the encoded length as a quality score to identify the set of sequential patterns that summarises the data best.
RuleGrowth: mining sequential rules common to several sequences by pattern-growth
This paper presents RuleGrowth, a novel algorithm for mining sequential rules common to several sequences; its pattern-growth approach to rule discovery can make it much more efficient and scalable.
Keeping it Short and Simple: Summarising Complex Event Sequences with Multivariate Patterns
Ditto, a highly efficient algorithm that approximates the ideal result very well, is introduced; it scales favourably with the length of the data, the number of attributes, and the alphabet sizes.
ERMiner: Sequential Rule Mining Using Equivalence Classes
An algorithm named ERMiner (Equivalence class based sequential Rule Miner) is proposed, which relies on the novel idea of searching using equivalence classes of rules having the same antecedent or consequent to prune the search space.
Mining closed strict episodes
  • Nikolaj Tatti, B. Cule
  • Computer Science, Mathematics
  • 2010 IEEE International Conference on Data Mining
  • 2010
This work discovers closed episodes by introducing strict episodes, and argues that this class is general enough while still admitting a natural subset relationship that can be used efficiently.
Mining Association Rules in Long Sequences
This paper presents an efficient algorithm to mine confident association rules within patterns, and concludes that it indeed gives intuitive results in a number of applications.
Discovery of Meaningful Rules in Time Series
This work shows why the ideas of symbolic stream rule discovery are not directly suitable for rule discovery in time series, and presents novel algorithms that allow us to quickly discover high quality rules in very large datasets that accurately predict the occurrence of future events.
Sets of Robust Rules, and How to Find Them
This paper defines the problem of association rule mining in terms of the Minimum Description Length principle, and proposes Grab, a greedy heuristic to efficiently discover good sets of noise-resistant rules directly from data.