BAHUI: Fast and Memory Efficient Mining of High Utility Itemsets Based on Bitmap

@article{Song2014BAHUIFA,
  title={BAHUI: Fast and Memory Efficient Mining of High Utility Itemsets Based on Bitmap},
  author={Wei Song and Yu Liu and Jinhong Li},
  journal={Int. J. Data Warehous. Min.},
  year={2014},
  volume={10},
  pages={1-15}
}
Mining high utility itemsets is one of the most important research issues in data mining owing to its ability to consider nonbinary frequency values of items in transactions and different profit values for each item. Although a number of relevant approaches have been proposed in recent years, they incur the problem of producing a large number of candidate itemsets for high utility itemsets. In this paper, the authors propose an efficient algorithm, namely BAHUI Bitmap-based Algorithm for High… 
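To make the abstract's notion of utility concrete, the sketch below computes the utility of an itemset as the sum, over the transactions that contain it, of purchased quantity times unit profit. It uses integer bitmaps (one bit per transaction) to find those transactions. This is not the BAHUI algorithm itself, whose description is cut off above. The toy data and the names PROFIT, TRANSACTIONS, item_bitmaps, cover, and utility are all assumptions made for illustration.

```python
# Toy data, invented for illustration only (not from the paper).
PROFIT = {"a": 5, "b": 2, "c": 1}  # unit profit per item
TRANSACTIONS = [                   # item -> purchased quantity
    {"a": 1, "b": 2},              # T0
    {"a": 2, "c": 3},              # T1
    {"a": 1, "b": 1, "c": 4},      # T2
    {"b": 3, "c": 2},              # T3
]


def item_bitmaps(transactions):
    """Map each item to an int whose bit i is set when transaction i contains it."""
    bitmaps = {}
    for tid, tx in enumerate(transactions):
        for item in tx:
            bitmaps[item] = bitmaps.get(item, 0) | (1 << tid)
    return bitmaps


def cover(itemset, bitmaps):
    """Bitwise-AND the item bitmaps: the transactions containing every item."""
    items = tuple(itemset)
    if not items:
        raise ValueError("itemset must be non-empty")
    result = bitmaps.get(items[0], 0)
    for item in items[1:]:
        result &= bitmaps.get(item, 0)
    return result


def utility(itemset, transactions, profit, bitmaps):
    """Sum quantity * unit profit of the itemset's items over its cover."""
    items = tuple(itemset)
    bits = cover(items, bitmaps)
    total = 0
    tid = 0
    while bits:
        if bits & 1:  # transaction `tid` contains the whole itemset
            tx = transactions[tid]
            total += sum(tx[i] * profit[i] for i in items)
        bits >>= 1
        tid += 1
    return total


BITMAPS = item_bitmaps(TRANSACTIONS)
```

With this toy data, utility(("a", "b"), TRANSACTIONS, PROFIT, BITMAPS) is 16: the itemset occurs in T0 (1*5 + 2*2 = 9) and T2 (1*5 + 1*2 = 7). An itemset is "high utility" when this value reaches a user-chosen threshold.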

EFIM: a fast and memory efficient algorithm for high-utility itemset mining
TLDR
A novel algorithm named EFIM (EFficient high-utility Itemset Mining), which introduces several new ideas to more efficiently discover high-utility itemsets and is in general two to three orders of magnitude faster than the state-of-the-art algorithms.
EFIM: A Highly Efficient Algorithm for High-Utility Itemset Mining
TLDR
An extensive experimental study on various datasets shows that EFIM is in general two to three orders of magnitude faster and consumes up to eight times less memory than the state-of-the-art algorithms d²HUP, HUI-Miner, HUP-Miner, FHM and UP-Growth+.
High-Utility Itemset Mining with Effective Pruning Strategies
TLDR
Two new, stricter upper bounds are designed to avoid visiting unnecessary nodes of an itemset, so that the search space of potential HUIs is greatly reduced and the execution time of the mining procedure improves.
FCHUIM: Efficient Frequent and Closed High-Utility Itemsets Mining
TLDR
An efficient algorithm called FCHUIM is used to mine a concise representation called the frequent closed high-utility itemset (FCHUI), which is based on a total summary list structure for storing and retrieving utility lists without repeatedly scanning the database.
FHM+: Faster High-Utility Itemset Mining Using Length Upper-Bound Reduction
TLDR
To discover HUIs efficiently with length constraints, FHM+ introduces the concept of Length Upper-Bound Reduction (LUR) and two novel upper bounds on the utility of itemsets, and shows that length constraints are effective at reducing the number of patterns.
CLS-Miner: efficient and effective closed high-utility itemset mining
TLDR
This paper proposes a novel algorithm called CLS-Miner, which utilizes the utility-list structure to directly compute the utilities of itemsets without producing candidates, and introduces three novel strategies to reduce the search space, namely chain-estimated utility co-occurrence pruning, lower branch pruning, and pruning by coverage.
Optimized High-Utility Itemsets Mining for Effective Association Mining Paper
TLDR
Experimental results show that OFHM takes less runtime and is therefore more efficient than other existing methods on large benchmark datasets.
An efficient algorithm for mining top-k on-shelf high utility itemsets
TLDR
An efficient algorithm named KOSHU (fast top-K on-shelf high utility itemset miner) is proposed to mine the top-k HOUs efficiently, while considering on-shelf time periods of items, and items having positive and/or negative unit profits.
A New Algorithm for High Average-utility Itemset Mining
TLDR
A new algorithm is proposed which uses a new list structure and pruning strategy which outperforms the state-of-the-art HAUIM algorithms in terms of runtime and memory consumption.

References

UP-Growth: an efficient algorithm for high utility itemset mining
TLDR
The experimental results show that UP-Growth not only reduces the number of candidates effectively but also outperforms other algorithms substantially in terms of execution time, especially when the database contains lots of long transactions.
A Projection-Based Approach for Discovering High Average-Utility Itemsets
TLDR
An efficient average-utility mining approach which adopts a projection technique and an indexing mechanism to speed up the execution and reduce the memory requirement in the mining process is proposed.
CTU-Mine: An Efficient High Utility Itemset Mining Algorithm Using the Pattern Growth Approach
TLDR
A new algorithm called CTU-Mine is proposed that mines high utility itemsets using the pattern growth approach, suitable for sparse data sets with short patterns, but not feasible for dense data sets or long patterns.
Efficient Tree Structures for High Utility Pattern Mining in Incremental Databases
TLDR
This paper proposes three novel tree structures to efficiently perform incremental and interactive HUP mining that can capture the incremental data without any restructuring operation, and shows that these tree structures are very efficient and scalable.
A Hybrid Method for High-Utility Itemsets Mining in Large High-Dimensional Data
TLDR
A hybrid model and a row enumeration-based algorithm, i.e., Inter-transaction, to discover high-utility itemsets from two directions, which makes full use of the characteristic that there are few common items between or among long transactions.
A Two-Phase Algorithm for Fast Discovery of High Utility Itemsets
TLDR
This paper presents a Two-Phase algorithm to efficiently prune down the number of candidates and precisely obtain the complete set of high utility itemsets on synthetic and real databases.
Extracting Share Frequent Itemsets with Infrequent Subsets
TLDR
This work defines the problem of finding share frequent itemsets, and shows that share frequency does not have the property of downward closure when it is defined in terms of the itemset as a whole.
Mining Frequent Patterns without Candidate Generation: A Frequent-Pattern Tree Approach
TLDR
A novel frequent-pattern tree (FP-tree) structure is proposed, which is an extended prefix-tree structure for storing compressed, crucial information about frequent patterns, and an efficient FP-tree-based mining method, FP-growth, is developed for mining the complete set of frequent patterns by pattern fragment growth.
Mining high utility itemsets
TLDR
A new utility-based pruning strategy is developed that allows low-utility itemsets to be pruned by means of a weaker but antimonotonic condition; it is shown not to require a user-specified minimum utility and hence to be effective in practice.
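Several of the summaries above turn on the fact that utility lacks the downward-closure property, so pruning has to rely on a weaker, antimonotonic bound. The sketch below illustrates this with the transaction-weighted utilization (TWU) bound associated with the Two-Phase algorithm. It is only an illustration under assumed toy data, not a reproduction of any listed paper's implementation; PROFIT, TRANSACTIONS, MIN_UTIL, and the function names are hypothetical.

```python
# Toy data, invented for illustration only.
PROFIT = {"a": 5, "b": 2, "c": 1}  # unit profit per item
TRANSACTIONS = [                   # item -> purchased quantity
    {"a": 1, "b": 2},
    {"a": 2, "c": 3},
    {"a": 1, "b": 1, "c": 4},
    {"b": 3, "c": 2},
]
MIN_UTIL = 15                      # hypothetical user threshold


def contains(tx, itemset):
    """True when the transaction holds every item of the itemset."""
    return all(item in tx for item in itemset)


def utility(itemset, transactions, profit):
    """Exact utility: quantity * profit of the itemset's items, over its cover."""
    return sum(
        sum(tx[i] * profit[i] for i in itemset)
        for tx in transactions
        if contains(tx, itemset)
    )


def transaction_utility(tx, profit):
    """Utility of a whole transaction (all of its items)."""
    return sum(qty * profit[item] for item, qty in tx.items())


def twu(itemset, transactions, profit):
    """Upper bound: total utility of every transaction containing the itemset."""
    return sum(
        transaction_utility(tx, profit)
        for tx in transactions
        if contains(tx, itemset)
    )


u_b = utility(("b",), TRANSACTIONS, PROFIT)       # 12
u_ab = utility(("a", "b"), TRANSACTIONS, PROFIT)  # 16
twu_b = twu(("b",), TRANSACTIONS, PROFIT)         # 28
twu_ab = twu(("a", "b"), TRANSACTIONS, PROFIT)    # 20
```

Here {b} falls below the threshold (12 < 15) while its superset {a, b} exceeds it (16), so pruning on exact utility would miss a high utility itemset. TWU never increases when an itemset grows (28 for {b}, 20 for {a, b}) and always bounds the exact utility from above, which is why it is safe to discard an itemset and all its supersets once its TWU drops below the threshold.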