Mining frequent patterns without candidate generation
Mining frequent patterns in transaction databases, time-series databases, and many other kinds of databases has been studied popularly in data mining research. Most of the previous studies adopt an…
  • 5,102
  • 715
  • Open Access
Data Mining: Concepts and Techniques
The increasing volume of data in modern business and science calls for more complex and sophisticated tools. Although advances in data mining technology have made extensive data collection much…
  • 13,680
  • 609
  • Open Access
Mining Frequent Patterns without Candidate Generation: A Frequent-Pattern Tree Approach
Mining frequent patterns in transaction databases, time-series databases, and many other kinds of databases has been studied popularly in data mining research. Most of the previous studies adopt an…
  • 2,340
  • 253
  • Open Access
gSpan: graph-based substructure pattern mining
  • X. Yan, Jiawei Han
  • Computer Science
  • IEEE International Conference on Data Mining…
  • 9 December 2002
We investigate new approaches for frequent graph-based pattern mining in graph datasets and propose a novel algorithm called gSpan (graph-based substructure pattern mining), which discovers frequent…
  • 2,170
  • 238
  • Open Access
A Framework for Clustering Evolving Data Streams
The clustering problem is a difficult problem for the data stream domain. This is because the large volumes of data arriving in a stream renders most traditional algorithms too inefficient. In recent…
  • 1,707
  • 231
  • Open Access
CMAR: accurate and efficient classification based on multiple class-association rules
Previous studies propose that associative classification has high classification accuracy and strong flexibility at handling unstructured data. However, it still suffers from the huge set of mined…
  • 1,341
  • 214
  • Open Access
Data Mining: Concepts and Techniques, 3rd edition
The book Knowledge Discovery in Databases, edited by Piatetsky-Shapiro and Frawley [PSF91], is an early collection of research papers on knowledge discovery from data. The book Advances in Knowledge…
  • 2,183
  • 179
  • Open Access
PathSim: Meta Path-Based Top-K Similarity Search in Heterogeneous Information Networks
Similarity search is a primitive operation in database and Web search engines. With the advent of large-scale heterogeneous information networks that consist of multi-typed, interconnected objects, …
  • 894
  • 162
  • Open Access
Semi-supervised Discriminant Analysis
Linear Discriminant Analysis (LDA) has been a popular method for extracting features which preserve class separability. The projection vectors are commonly obtained by maximizing the between class…
  • 652
  • 147
  • Open Access
CPAR: Classification based on Predictive Association Rules
Recent studies in data mining have proposed a new classification approach, called associative classification, which, according to several reports, such as [7, 6], achieves higher classification…
  • 877
  • 136
  • Open Access