# Dynamic itemset counting and implication rules for market basket data

@inproceedings{Brin1997DynamicIC, title={Dynamic itemset counting and implication rules for market basket data}, author={Sergey Brin and Rajeev Motwani and Jeffrey D. Ullman and Shalom Tsur}, booktitle={SIGMOD '97}, year={1997} }

We consider the problem of analyzing market-basket data and present several important contributions. [...] Second, we present a new way of generating “implication rules,” which are normalized based on both the antecedent and the consequent and are truly implications (not simply a measure of co-occurrence), and we show how they produce more intuitive results than other methods. Finally, we show how different characteristics of real data, as opposed to synthetic data, can dramatically affect the…
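The abstract excerpt above does not give a formula for its implication rules. The measure usually associated with this paper is conviction, commonly written as P(A)·P(¬B) / P(A ∧ ¬B). The sketch below assumes that definition; the toy baskets and the function name `conviction` are illustrative and not taken from the source.

```python
from typing import FrozenSet, Iterable, List


def conviction(
    baskets: List[FrozenSet[str]],
    antecedent: Iterable[str],
    consequent: Iterable[str],
) -> float:
    """Conviction of the rule antecedent -> consequent (assumed definition).

    conviction = P(A) * P(not B) / P(A and not B)

    A value of 1.0 means A and B look independent; larger values mean the
    rule is violated less often than independence would predict.
    """
    a = frozenset(antecedent)
    b = frozenset(consequent)
    n = len(baskets)

    # Fraction of baskets containing the antecedent.
    p_a = sum(1 for t in baskets if a <= t) / n
    # Fraction of baskets missing at least one consequent item.
    p_not_b = sum(1 for t in baskets if not (b <= t)) / n
    # Fraction of baskets that contain A yet violate B.
    p_a_not_b = sum(1 for t in baskets if a <= t and not (b <= t)) / n

    if p_a_not_b == 0:
        # The rule is never violated in this data (this also covers the
        # degenerate case where the antecedent never occurs).
        return float("inf")
    return p_a * p_not_b / p_a_not_b


# Hypothetical toy data, for illustration only.
BASKETS = [
    frozenset({"bread", "butter"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread"}),
    frozenset({"milk"}),
]

bread_to_butter = conviction(BASKETS, {"bread"}, {"butter"})
butter_to_bread = conviction(BASKETS, {"butter"}, {"bread"})
```

On this toy data the two directions differ: butter never appears without bread, so butter → bread is never violated, while bread → butter is violated once. That asymmetry is what separates a directional implication from a symmetric co-occurrence score.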

## 2,234 Citations

A new framework for itemset generation

- Computer Science, PODS '98
- 1998

An algorithm is presented that offers very good computational efficiency while maintaining statistical robustness, which implies that the method can be applied to find association rules in datasets in which items may appear in a sizeable percentage of the transactions (dense datasets), datasets in which the items have varying density, or even negative association rules.

Mining Market Basket Data Using Share Measures and Characterized Itemsets

- Computer Science, PAKDD
- 1998

The share-confidence framework for knowledge discovery from databases is proposed which addresses the problem of mining itemsets from market basket data and suggests how characterized itemsets can be generalized according to concept hierarchies associated with the characteristic attributes.

Mining Associations with the Collective Strength Approach

- Computer Science, IEEE Trans. Knowl. Data Eng.
- 2001

An algorithm is presented that offers very good computational efficiency while maintaining statistical robustness; because it relies on relative measures rather than absolute measures such as support, the method can be applied to find association rules in data sets in which items may appear in a sizeable percentage of the transactions.

A New Approach for the Discovery of Frequent Itemsets

- Computer Science, DaWaK
- 1999

An algorithm that requires only one pass over the database, exhibits a linear scale-up property with the dimensions of the database and, as shown by the experiments, performs better than other classical algorithms.

Mining frequent itemsets in data streams within a time horizon

- Computer Science, Data Knowl. Eng.
- 2014

The experimental results prove that the proposed algorithm for mining frequent itemsets in a stream of transactions within a limited time horizon is faster than other approaches but has a slightly higher cost in terms of memory.

Scalable APRIORI-Based Frequent Pattern Discovery

- Computer Science, 2009 International Conference on Computational Science and Engineering
- 2009

This paper takes the classic algorithm for the frequent pattern discovery problem, A Priori, and by adding a vertical sort drastically improves its performance characteristics when processing very large datasets.

UNIC : UNique Item Counts for Association Rule Mining in Relational Data

- 2004

Association rule mining (ARM) can be generalized to relational data by using joined relations as basis. We demonstrate that typically such an approach results in an overwhelming number of rules that…

Advances in Mining Binary Data: Itemsets as Summaries

- Computer Science
- 2008

This thesis shows how to use itemsets for answering queries, that is, finding out the number of transactions satisfying some given formula, and proposes a new concept called normalised correlation dimension, a known concept that works well with real-valued data.

Market basket analysis with networks

- Computer Science, Social Network Analysis and Mining
- 2010

It is demonstrated that the network-based approach can concisely isolate influence among products, mitigating the need to search through massive lists of association rules, and an interestingness measure for communities of products is developed and shown to isolate useful, actionable communities.

Extracting Share Frequent Itemsets with Infrequent Subsets

- Computer Science, Data Mining and Knowledge Discovery
- 2004

This work defines the problem of finding share frequent itemsets, and shows that share frequency does not have the property of downward closure when it is defined in terms of the itemset as a whole.

## References

Fast Algorithms for Mining Association Rules

- Computer Science, VLDB 1994
- 1994

Two new algorithms for solving this problem that are fundamentally different from the known algorithms are presented, and empirical evaluation shows that these algorithms outperform the known algorithms by factors ranging from three for small problems to more than an order of magnitude for large problems.

Fast algorithms for mining association rules

- Computer Science, VLDB 1998
- 1998

Two new algorithms for solving this problem that are fundamentally different from the known algorithms are presented, and empirical evaluation shows that these algorithms outperform the known algorithms by factors ranging from three for small problems to more than an order of magnitude for large problems.

Mining sequential patterns

- Computer Science, Proceedings of the Eleventh International Conference on Data Engineering
- 1995

Three algorithms are presented to solve the problem of mining sequential patterns over databases of customer transactions, and empirically evaluating their performance using synthetic data shows that two of them have comparable performance.

Sampling Large Databases for Association Rules

- Computer Science, VLDB
- 1996

New algorithms are presented that reduce database activity considerably: they pick a random sample, use this sample to find all association rules that probably hold in the whole database, and then verify the results against the rest of the database.

Mining generalized association rules

- Computer Science, Future Gener. Comput. Syst.
- 1997

A new interest-measure for rules which uses the information in the taxonomy is presented, and given a user-specified “minimum-interest-level”, this measure prunes a large number of redundant rules.

Mining association rules between sets of items in large databases

- Computer Science, SIGMOD '93
- 1993

An efficient algorithm is presented that generates all significant association rules between items in the database of customer transactions and incorporates buffer management and novel estimation and pruning techniques.

SLIQ: A Fast Scalable Classifier for Data Mining

- Computer Science, EDBT
- 1996

Issues in building a scalable classifier are discussed, and the design of SLIQ is presented: a new classifier that uses a novel pre-sorting technique in the tree-growth phase to enable classification of disk-resident datasets.

Database Mining: A Performance Perspective

- Computer Science, IEEE Trans. Knowl. Data Eng.
- 1993

The authors' perspective of database mining as the confluence of machine learning techniques and the performance emphasis of database technology is presented and an algorithm for classification obtained by combining the basic rule discovery operations is given.

Fast Similarity Search in the Presence of Noise, Scaling, and Translation in Time-Series Databases

- Computer Science, VLDB
- 1995

We introduce a new model of similarity of time sequences that captures the intuitive notion that two sequences should be considered similar if they have enough non-overlapping time-ordered pairs of…

Proc. of the Int'l Conf. on Very Large Data Bases (VLDB)

- Proc. of the Int'l Conf. on Very Large Data Bases (VLDB)
- 1995