Mining frequent itemsets with convertible constraints

@inproceedings{Pei2001MiningFI,
  title={Mining frequent itemsets with convertible constraints},
  author={Jian Pei and Jiawei Han and Laks V. S. Lakshmanan},
  booktitle={Proceedings 17th International Conference on Data Engineering},
  year={2001},
  pages={433--442}
}
Recent work has highlighted the importance of the constraint-based mining paradigm in the context of frequent itemsets, associations, correlations, sequential patterns, and many other interesting patterns in large databases. The authors study constraints which cannot be handled with existing theory and techniques. For example, avg(S) θ ν, median(S) θ ν, sum(S) θ ν (S can contain items of arbitrary values) (θ ∈ {≥, …
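A constraint such as avg(S) ≥ ν is not anti-monotone over arbitrary subsets, but it becomes prunable once items are listed in value-descending order: if an itemset satisfies the constraint, so does every prefix of it under that order. The following is a minimal Python sketch of that prefix-pruning idea only. The function name, the item values, and the threshold are hypothetical illustrations; the sketch ignores support counting and is not the paper's algorithm.

```python
def itemsets_with_avg_at_least(item_values, threshold):
    """Enumerate itemsets whose average item value is >= threshold.

    Items are visited in value-descending order, so a failing prefix
    lets us prune all of its extensions (illustrative sketch only).
    """
    # Fix a value-descending order; ties are broken by item name.
    order = sorted(item_values, key=lambda item: (-item_values[item], item))
    results = []

    def grow(prefix, total, start):
        for i in range(start, len(order)):
            item = order[i]
            new_total = total + item_values[item]
            new_prefix = prefix + [item]
            if new_total / len(new_prefix) < threshold:
                # Later items have values no larger than this one, so every
                # remaining sibling and every extension also fails.
                break
            results.append(tuple(new_prefix))
            grow(new_prefix, new_total, i + 1)

    grow([], 0.0, 0)
    return results


if __name__ == "__main__":
    # Hypothetical item values, chosen only for illustration.
    values = {"a": 10, "b": 8, "c": 3, "d": 1}
    for itemset in itemsets_with_avg_at_least(values, 7):
        print(itemset)
```

Without the fixed order, the same constraint gives no safe pruning rule: adding a high-valued item to a failing itemset can raise its average back above the threshold.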
Pushing Convertible Constraints in Frequent Itemset Mining
TLDR
A notion of convertible constraints is developed; this class is systematically analyzed, classified, and characterized, and techniques are developed that enable such constraints to be readily pushed deep inside the recently developed FP-growth algorithm for frequent itemset mining.
Efficient mining for frequent itemsets with multiple convertible constraints
  • Bao-Lisong, Zhen Qin
  • Computer Science
    2005 International Conference on Machine Learning and Cybernetics
  • 2005
TLDR
This paper studies multiple convertible constraints and develops techniques that enable them to be readily pushed deep inside an algorithm for frequent itemset mining.
Pushing Constraints into a Pattern-Tree
TLDR
This work proposes a set of strategies to push each type of constraint into pattern mining through the use of a pattern-tree structure to efficiently store, check and prune the patterns.
Mining free itemsets under constraints
TLDR
The authors show that the benefits of user-defined constraints and closed set mining can be combined into levelwise algorithms, and an experimental validation related to the discovery of association rules with negations is reported.
Fast generation of sequential patterns with item constraints from concise representations
TLDR
The proposed MFS-IC algorithm outperforms state-of-the-art algorithms, which directly mine frequent sequences with constraints from an SDB, in terms of runtime, memory usage and scalability.
Definition 2 (Constrained Frequent Itemset Mining) A constraint on itemsets is a function C : 2
The constraint-based pattern discovery paradigm was introduced with the aim of providing to the user a tool to drive the discovery process towards potentially interesting patterns, with the positive
Structure of frequent itemsets with extended double constraints
TLDR
All theoretical results in this paper are proven, and they provide a firm basis for the correctness and efficiency of a new algorithm, MFS-EDC, which is used to effectively mine all constrained frequent itemsets.
Constraint-Based Discovery of a Condensed Representation for Frequent Patterns
Computing frequent itemsets and their frequencies from large boolean matrices (e.g., to derive association rules) has been one of the hot topics in data mining. Levelwise algorithms (e.g., the
Bifold constraint-based mining by simultaneous monotone and anti-monotone checking
TLDR
This paper proposes an approach that allows the efficient mining of frequent itemset patterns while simultaneously pushing both monotone and anti-monotone constraints during the mining process and at different strategic stages of it.
Constraint-Based Graph Mining in Large Database
TLDR
The authors develop a framework, CabGin, that pushes various constraints systematically into the graph mining process according to their categories; experimental results show that CabGin can prune a large search space effectively by pushing graph-based constraints into the mining process.

References

CLOSET: An Efficient Algorithm for Mining Frequent Closed Itemsets
TLDR
An efficient algorithm, CLOSET, for mining closed itemsets is proposed, with the development of three techniques: applying a compressed frequent pattern tree (FP-tree) structure for mining closed itemsets without candidate generation, and developing a single prefix path compression technique to identify frequent closed itemsets quickly.
Mining frequent patterns without candidate generation
TLDR
This study proposes a novel frequent pattern tree (FP-tree) structure, which is an extended prefix-tree structure for storing compressed, crucial information about frequent patterns, and develops an efficient FP-tree-based mining method, FP-growth, for mining the complete set of frequent patterns by pattern fragment growth.
Optimization of constrained frequent set queries with 2-variable constraints
TLDR
A notion of quasi-succinctness is introduced, which allows a quasi-succinct 2-var constraint to be reduced to two succinct 1-var constraints for pruning, and a query optimizer is proposed that is ccc-optimal, i.e., minimizing the effort incurred w.r.t. constraint checking and support counting.
Exploratory mining and pruning optimizations of constrained associations rules
TLDR
An architecture that opens up the black-box, and supports constraint-based, human-centered exploratory mining of associations, and introduces and analyzes two properties of constraints that are critical to pruning: anti-monotonicity and succinctness.
Mining Association Rules with Item Constraints
TLDR
This work considers the problem of integrating constraints that are Boolean expressions over the presence or absence of items into the association discovery algorithm and presents three integrated algorithms for mining association rules with item constraints and discusses their tradeoffs.
Constraint-based rule mining in large, dense databases
TLDR
A new algorithm is described that directly exploits all user-specified constraints including minimum support, minimum confidence, and a new constraint that ensures every mined rule offers a predictive advantage over any of its simplifications.
Beyond market baskets: generalizing association rules to correlations
TLDR
This work develops the notion of mining rules that identify correlations (generalizing associations), and proposes measuring significance of associations via the chi-squared test for correlation from classical statistics, enabling the mining problem to reduce to the search for a border between correlated and uncorrelated itemsets in the lattice.
Fast algorithms for mining association rules
TLDR
Two new algorithms for solving this problem that are fundamentally different from the known algorithms are presented, and empirical evaluation shows that these algorithms outperform the known algorithms by factors ranging from three for small problems to more than an order of magnitude for large problems.
Efficient mining of constrained correlated sets
TLDR
It turns out that constraints can have subtle interactions with correlated item sets, depending on their underlying properties, and this work delineates the meaning of these two spaces and gives algorithms for computing them.
Can we push more constraints into frequent pattern mining?