#### Filter Results:

#### Publication Year

1996

2016

#### Co-author

#### Key Phrase

#### Publication Venue

#### Data Set Used

Learn More

1 Introduction Spatial data mining, i.e., discovery of interesting characteristics and patterns that may implicitly exist in spatial databases, is a challenging task due to the huge amounts of spatial data and to the new conceptual nature of the problems which must account for spatial distance. Clustering and region oriented queries are common problems in… (More)

Clustering is the process of grouping a set of objects into classes of <i>similar</i> objects. Although definitions of similarity vary from one clustering model to another, in most of these models the concept of similarity is based on distances, e.g., Euclidean distance or cosine distance. In other words, similar objects are required to have close values on… (More)

Microarrays are one of the latest breakthroughs in experimental molecular biology, which provide a powerful tool by which the expression patterns of thousands of genes can be monitored simultaneously and are already producing huge amount of valuable data. The concept of bicluster was introduced by Cheng and Church (2000) to capture the coherence of a subset… (More)

- Jun Huan, Wei Wang, Jan Prins, Jiong Yang
- KDD
- 2004

One fundamental challenge for mining recurring subgraphs from semi-structured data sets is the overwhelming abundance of such patterns. In large graph databases, the total number of frequent subgraphs can become too large to allow a full enumeration using reasonable computational resources. In this paper, we propose a new algorithm that mines only… (More)

Frequent itemset mining is a popular and important first step in the analysis of data arising in a broad range of applications. The traditional " exact " model for frequent itemsets requires that every item occurs in each supporting transaction. Real data is typically subject to noise and measurement error. To date, the effects of noise on exact frequent… (More)

Clustering is the process of grouping a set of objects into classes of similar objects. Because of unknownness of the hidden patterns in the data sets, the definition of similarity is very subtle. Until recently, similarity measures are typically based on distances, e.g Euclidean distance and cosine distance. In this paper, we propose a flexible yet… (More)

Mining outliers in database is to find exceptional objects that deviate from the rest of the data set. Besides classical outlier analysis algorithms, recent studies have focused on mining local outliers, i.e., the outliers that have density distribution significantly different from their neighborhood. The estimation of density distribution at the location… (More)

We present new algorithms for performing fast computation of several common database operations on commodity graphics processors. Specifically, we consider operations such as conjunctive selections, aggregations, and semi-linear queries, which are essential computational components of typical database, data warehousing, and data mining applications. While… (More)

- Jinze Liu, Wei Wang, Jiong Yang
- CSB
- 2004

The soundness of clustering in the analysis of gene expression profiles and gene function prediction is based on the hypothesis that genes with similar expression profiles may imply strong correlations with their functions in the biological activities. Gene Ontology (GO) has become a well accepted standard in organizing gene function categories. Different… (More)

Frequent subgraph mining is an active research topic in the data mining community. A graph is a general model to represent data and has been used in many domains like cheminformatics and bioinformatics. Mining patterns from graph databases is challenging since graph related operations , such as subgraph testing, generally have higher time complexity than… (More)