- Feng Pan, Gao Cong, Anthony K. H. Tung, Jiong Yang, Mohammed J. Zaki
- KDD
- 2003

The growth of bioinformatics has resulted in datasets with new characteristics. These datasets typically contain a large number of columns and a small number of rows. For example, many gene expression datasets may contain 10,000-100,000 columns but only 100-1000 rows. Such datasets pose a great challenge for existing (closed) frequent pattern discovery…

- Zhiping Zeng, Anthony K. H. Tung, Jianyong Wang, Jianhua Feng, Lizhu Zhou
- PVLDB
- 2009

Graph data have become ubiquitous, and manipulating them based on similarity is essential for many applications. Graph edit distance is one of the most widely accepted measures of similarity between graphs and has extensive applications in fields such as pattern recognition and computer vision. Unfortunately, the problem of graph edit distance…

In many decision-making applications, the skyline query is frequently used to find a set of dominating data points (called skyline points) in a multi-dimensional dataset. In a high-dimensional space, skyline points no longer offer any interesting insights because there are too many of them. In this paper, we introduce a novel metric, called skyline frequency that…

- Wen Jin, Anthony K. H. Tung, Jiawei Han, Wei Wang
- PAKDD
- 2006

Mining outliers in a database means finding exceptional objects that deviate from the rest of the data set. Besides classical outlier analysis algorithms, recent studies have focused on mining local outliers, i.e., outliers whose density distribution differs significantly from that of their neighborhood. The estimation of density distribution at the location…

- Chee Yong Chan, H. V. Jagadish, Kian-Lee Tan, Anthony K. H. Tung, Zhenjie Zhang
- SIGMOD Conference
- 2006

Given a *d*-dimensional data set, a point *p* dominates another point *q* if it is better than or equal to *q* in all dimensions and better than *q* in at least one dimension. A point is a skyline point if no other point dominates it. Skyline queries, which return skyline points, are useful in many decision…
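The dominance and skyline definitions in this abstract can be illustrated with a minimal Python sketch. It is not the algorithm from the paper: it is a naive quadratic scan, it assumes that smaller values are "better" in every dimension, and the names `dominates`, `skyline`, and the `hotels` data (price, distance) are hypothetical, chosen only for illustration.

```python
from typing import List, Sequence

Point = Sequence[float]


def dominates(p: Point, q: Point) -> bool:
    """Return True if p dominates q (assumption: smaller is better).

    p must be no worse than q in every dimension and strictly
    better in at least one dimension.
    """
    no_worse = all(a <= b for a, b in zip(p, q))
    strictly_better = any(a < b for a, b in zip(p, q))
    return no_worse and strictly_better


def skyline(points: List[Point]) -> List[Point]:
    """Naive O(n^2) skyline: keep every point that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical data: (price, distance to the beach), lower is better in both.
hotels = [(50, 8), (60, 2), (80, 1), (70, 5), (55, 9)]
print(skyline(hotels))  # [(50, 8), (60, 2), (80, 1)]
```

Here `(70, 5)` is dropped because `(60, 2)` is better in both dimensions, and `(55, 9)` is dropped because `(50, 8)` dominates it. A point never dominates itself, since the definition requires a strict improvement in at least one dimension.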

- Dongxiang Zhang, Yeow Meng Chee, Anirban Mondal, Anthony K. H. Tung, Masaru Kitsuregawa
- 2009 IEEE 25th International Conference on Data…
- 2009

This work addresses a novel spatial keyword query called the m-closest keywords (mCK) query. Each tuple in a database of spatial objects is associated with descriptive information in the form of keywords. The mCK query aims to find the spatially closest tuples that match m user-specified keywords. Given a set of keywords from a…

- Gao Cong, Kian-Lee Tan, Anthony K. H. Tung, Xin Xu
- SIGMOD Conference
- 2005

In this paper, we propose a novel algorithm to discover the top-k covering rule groups for each row of gene expression profiles. Several experiments on real bioinformatics datasets show that the new top-k covering rule mining algorithm is orders of magnitude faster than previous association rule mining algorithms. Furthermore, we propose a new classification…

- Jian Pei, Anthony K. H. Tung, Jiawei Han
- DMKD
- 2001

- Liping Ji, Kian-Lee Tan, Anthony K. H. Tung
- VLDB
- 2006

In this paper, we introduce the concept of frequent closed cube (FCC), which generalizes the notion of 2D frequent closed pattern (FCP) to the 3D context. We propose two novel algorithms to mine FCCs from 3D datasets. The first scheme is a Representative Slice Mining (RSM) framework that can be used to extend existing 2D FCP mining algorithms for FCC mining. The…

- Gao Cong, Anthony K. H. Tung, Xin Xu, Feng Pan, Jiong Yang
- SIGMOD Conference
- 2004

Microarray datasets typically contain a large number of columns but a small number of rows. Association rules have proven useful in analyzing such datasets. However, most existing association rule mining algorithms are unable to efficiently handle datasets with a large number of columns. Moreover, the number of association rules generated from such…