An Effective Uncertain Data Streams Top-K Query Algorithm
- Duan Mingyi, Lu Yinju
Frequent itemset mining is a common data mining task for many real-life applications. The mined frequent itemsets can be served as building blocks for various patterns including association rules and frequent sequences. Many existing algorithms mine for frequent itemsets from traditional static transaction databases, in which the contents of each transaction (namely, items) are definitely known and precise. However, there are many situations in which ones are uncertain about the contents of transactions. This calls for the mining of uncertain data. Moreover, there are also situations in which users are interested in only some portions of the mined frequent itemsets (i.e., itemsets satisfying user-specified constraints, which express the user interest). This leads to constrained mining. Furthermore, due to advances in technology, a flood of data can be produced in many situations. This calls for the mining of data streams. To deal with all these situations, we propose tree-based algorithms to efficiently mine streams of uncertain data for frequent itemsets that satisfy user-specified constraints.