Parallel Algorithms for Discovery of Association Rules

@article{Zaki2004ParallelAF,
  title={Parallel Algorithms for Discovery of Association Rules},
  author={Mohammed J. Zaki and Srinivasan Parthasarathy and Mitsunori Ogihara and Wei Li},
  journal={Data Mining and Knowledge Discovery},
  year={1997},
  volume={1},
  pages={343--373}
}
Discovery of association rules is an important data mining task. Several parallel and sequential algorithms have been proposed in the literature to solve this problem. Almost all of these algorithms make repeated passes over the database to determine the set of frequent itemsets (a subset of database items), thus incurring high I/O overhead. In the parallel case, most algorithms perform a sum-reduction at the end of each pass to construct the global counts, also incurring high synchronization… 
Efficient mining of maximal frequent itemsets from databases on a cluster of workstations
TLDR
Both DMM and PMM demonstrate very good performance and scalability even when there are large maximal frequent itemsets in databases, and their performance was evaluated for various cases.
Graph Based Approach for Finding Frequent Itemsets to Discover Association Rules
The discovery of association rules is an important task in data mining and knowledge discovery. Several algorithms have been developed for finding frequent itemsets and mining comprehensive
Distributed and Shared Memory Algorithm for Parallel Mining of Association Rules
TLDR
This paper presents a novel algorithm that efficiently exploits the trade-offs between computation, communication, memory usage and synchronization, and studies the effect of these variations on the overall performance.
New parallel algorithms for frequent itemset mining in very large databases
TLDR
New parallel algorithms for frequent itemset mining are presented and their efficiency is demonstrated through a series of experiments on different parallel environments, ranging from shared-memory multiprocessor machines to a set of SMP clusters connected through a high-speed network.
Fast Parallel Mining of Frequent Itemsets
Association rule mining has become an essential data mining technique in various fields and the massive growth of the available data demands more and more computational power. To address this issue,
AN ENHANCED SEMI-APRIORI ALGORITHM FOR MINING ASSOCIATION RULES
Mining association rules in large databases is one of the research issues in data mining and knowledge discovery, although many algorithms have been designed to efficiently discover the frequent pattern and
Parallel Frequent Item Set Mining with Selective Item Replication
TLDR
A transaction database distribution scheme that divides the frequent item set mining task in a top-down fashion and is used in the design of two new parallel frequent item set mining algorithms that replicate the items that correspond to the separator.
A Fast Algorithm Based on Apriori Algorithms to Explore the Set of Repetitive Items of Large Transaction Data
TLDR
A parallel algorithm is proposed to explore the collection of repetitive items in big and dense transaction databases by using numerical candidate distribution together with two processes at each level of parallel processing to speed up the extraction of data patterns.
Comparative Study of Apriori Algorithms for Parallel Mining of Frequent Itemsets
TLDR
This comparative study of parallel frequent itemset mining algorithms addresses the issue of distributing the candidates among processors so that their counting and creation are effectively parallelized.
Parallel mining of association rules from text databases
TLDR
The new PMIHP algorithm is a parallel version of the Multipass with Inverted Hashing and Pruning algorithm, which was shown to be more efficient than other existing algorithms in the context of mining text databases.

References

A localized algorithm for parallel association mining
TLDR
This paper describes a new parallel association mining algorithm that uses simple intersection operations to compute frequent itemsets and doesn’t have to maintain or search complex hash structures.
New Algorithms for Fast Discovery of Association Rules
TLDR
New algorithms for fast association mining, which scan the database only once, are presented, addressing the open question of whether all the rules can be efficiently extracted in a single database pass.
Scalable parallel data mining for association rules
TLDR
The experimental results on a Cray T3D parallel computer show that the Hybrid Distribution algorithm scales linearly, exploits the aggregate memory better, and can generate more association rules with a single scan of the database per pass.
Efficient parallel data mining for association rules
TLDR
An algorithm called PDM for parallel data mining of association rules, designed so that the global set of large itemsets can be identified efficiently and the amount of inter-node data exchange required is minimized.
Efficient Mining of Association Rules in Distributed Databases
TLDR
An efficient algorithm called DMA (Distributed Mining of Association rules), which generates a small number of candidate sets and requires only O(n) messages for support-count exchange for each candidate set, in distributed databases.
Parallel Data Mining for Association Rules on Shared-Memory Multi-Processors
TLDR
This paper presents parallel algorithms for data mining of association rules, and studies the degree of parallelism, synchronization, and data locality issues on the SGI Power Challenge shared-memory multi-processor.
A fast distributed algorithm for mining association rules
TLDR
An interesting distributed association rule mining algorithm, FDM (fast distributed mining of association rules), is proposed; it generates a small number of candidate sets and substantially reduces the number of messages passed when mining association rules.
Set-oriented mining for association rules in relational databases
  • M. Houtsma, A. Swami
  • Computer Science
    Proceedings of the Eleventh International Conference on Data Engineering
  • 1995
TLDR
This paper shows that at least some aspects of data mining can be carried out by using general query languages such as SQL, rather than by developing specialized black-box algorithms.
An effective hash-based algorithm for mining association rules
TLDR
The number of candidate 2-itemsets generated by the proposed algorithm is orders of magnitude smaller than that of previous methods, thus resolving the performance bottleneck; the algorithm also allows the transaction database to be trimmed effectively at a much earlier stage of the iterations, significantly reducing the computational cost of later iterations.
Sampling Large Databases for Association Rules
TLDR
New algorithms that reduce the database activity considerably by picking a random sample, using this sample to find all association rules that probably hold in the whole database, and then verifying the results against the rest of the database.