Utility Mining Across Multi-Dimensional Sequences

  title={Utility Mining Across Multi-Dimensional Sequences},
  author={Wensheng Gan and Chun-Wei Lin and Jiexiong Zhang and Hongzhi Yin and Philippe Fournier-Viger and H. C. Chao and Philip S. Yu},
  journal={ACM Transactions on Knowledge Discovery from Data (TKDD)},
  pages={1 - 24}
Knowledge extraction from databases is a fundamental task in the database and data mining communities, and it has been applied to a wide range of real-world applications. Unlike support-based mining models, the utility-oriented mining framework integrates utility theory to provide more informative and useful patterns. Time-dependent sequence data are common in real life and have been widely used in many applications, such as analyzing sequential…
An Efficient Algorithm for Extracting High-Utility Hierarchical Sequential Patterns
This paper incorporates the hierarchical relation of items into HUSPM and proposes MHUH, a two-phase algorithm and the first algorithm for high-utility hierarchical sequential pattern mining (HUHSPM), which efficiently extracts more interesting patterns with underlying informative knowledge.
On-Shelf Utility Mining of Sequence Data
Two methods, OSUM of sequence data (OSUMS) and OSUMS+, are proposed to extract on-shelf high-utility sequential patterns; substantial experimental results show that both methods outperform the state-of-the-art algorithm.
TUSQ: Targeted High-Utility Sequence Querying
A novel algorithm, namely targeted high-utility sequence querying (TUSQ), is developed based on two novel upper bounds, suffix remain utility and terminated descendants utility, as well as a vertical Last Instance Table structure.
Explainable Fuzzy Utility Mining on Sequences
This study investigates explainable fuzzy-theoretic utility mining on multi-sequences and proposes a novel method termed pattern growth fuzzy utility mining (PGFUM), which achieves not only human-explainable mining results that contain the original nature of revealable intelligibility, but also high efficiency in terms of runtime and memory cost.


A Survey of Utility-Oriented Pattern Mining
An in-depth understanding of UPM is introduced, including concepts, examples, and comparisons with related concepts, and a comprehensive review of advanced topics of existing high-utility pattern mining techniques is offered, with a discussion of their pros and cons.
Applying the maximum utility measure in high utility sequential pattern mining
A maximum utility measure is presented, derived from the principle of traditional sequential pattern mining that the count of a subsequence in the sequence is only regarded as one, which is properly used to simplify the utility calculation for subsequences in mining.
High utility pattern mining over data streams with sliding window technique
An algorithm is proposed for mining high utility patterns in resource-limited environments through efficient processing of data streams, in order to solve the problems of overestimation-based methods, and two techniques for reducing overestimated utilities are developed.
Constraint-Based Multidimensional Data Mining
Together, constraint-based and multidimensional techniques can provide a more ad hoc, query-driven process that exploits the semantics of data more effectively than those supported by current standalone data-mining systems.
On Incremental High Utility Sequential Pattern Mining
The IncUSP-Miner+ algorithm to mine HUSPs incrementally is proposed, with a tighter upper bound of the utility of a sequence, called Tight Sequence Utility (TSU), and a novel data structure to buffer the sequences whose TSU values are greater than or equal to the minimum utility threshold in the original database.
USpan: an efficient algorithm for mining high utility sequential patterns
This paper introduces the lexicographic quantitative sequence tree to extract the complete set of high utility sequences, designs concatenation mechanisms for calculating the utility of a node and its children with two effective pruning strategies, and defines a generic framework for high utility sequence mining.
HUOPM: High-Utility Occupancy Pattern Mining
Results show that the derived patterns are intelligible, reasonable, and acceptable, and that HUOPM with its pruning strategies outperforms the state-of-the-art algorithm, in terms of runtime and search space, respectively.
Mining sequential patterns from multidimensional sequence data
To mine sequential patterns from this kind of sequence data, two efficient algorithms have been developed in this work.
Efficient algorithms for mining high-utility itemsets in uncertain databases
A novel framework, named potential high-utility itemset mining (PHUIM) in uncertain databases, is proposed to efficiently discover not only the itemsets with high utilities but also the itemsets with high existence probabilities in an uncertain database based on the tuple uncertainty model.
A Novel Approach for Mining High-Utility Sequential Patterns in Sequence Databases
This paper proposes a novel framework for mining high-utility sequential patterns for more real-life applicable information extraction from sequence databases with non-binary frequency values of items in sequences and different importance/significance values for distinct items.
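
Several of the summaries above rest on the same idea: items carry non-binary quantities in each sequence and a per-item importance value, and a pattern's utility in a sequence is taken as the maximum over its occurrences. The following is a minimal Python sketch of that calculation, not the method of any listed paper. The names `pattern_utility`, `database_utility`, `profit`, and `db`, the toy data, and the restriction to patterns made of single-item elements are all assumptions made for illustration.

```python
from typing import Dict, List, Optional

# A quantitative sequence: ordered itemsets, each mapping item -> quantity.
QSequence = List[Dict[str, int]]


def pattern_utility(pattern: List[str], seq: QSequence,
                    profit: Dict[str, int]) -> Optional[int]:
    """Maximum utility of `pattern` over all its occurrences in `seq`.

    Returns None when the pattern does not occur. Each pattern item must be
    matched in a strictly later itemset than the previous one (assumption:
    patterns consist of single-item elements only).
    """
    # best[k] = highest utility found so far for the first k pattern items.
    best: List[Optional[int]] = [0] + [None] * len(pattern)
    for itemset in seq:
        # Right-to-left so one itemset never serves two pattern positions.
        for k in range(len(pattern), 0, -1):
            item = pattern[k - 1]
            prev = best[k - 1]
            if item in itemset and prev is not None:
                cand = prev + itemset[item] * profit[item]
                if best[k] is None or cand > best[k]:
                    best[k] = cand
    return best[len(pattern)]


def database_utility(pattern: List[str], db: List[QSequence],
                     profit: Dict[str, int]) -> int:
    """Sum of per-sequence maximum utilities over sequences containing the pattern."""
    utilities = (pattern_utility(pattern, s, profit) for s in db)
    return sum(u for u in utilities if u is not None)


# Hypothetical toy data: unit profits and three quantitative sequences.
profit = {"a": 2, "b": 5, "c": 1}
db = [
    [{"a": 1}, {"b": 2}, {"a": 3}, {"b": 1}],
    [{"b": 1}, {"a": 2, "c": 4}],
    [{"a": 1, "b": 1}, {"b": 3}],
]

total = database_utility(["a", "b"], db, profit)
```

With a hypothetical minimum utility threshold of 25, the pattern `["a", "b"]` would count as high-utility here even though it occurs in only two of the three sequences, which is the kind of case a purely support-based model could rank differently.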