gSpan: Graph-Based Substructure Pattern Mining

Abstract

We investigate new approaches for frequent graph-based pattern mining in graph datasets and propose a novel algorithm called gSpan (graph-based Substructure pattern mining), which discovers frequent substructures without candidate generation. gSpan builds a new lexicographic order among graphs, and maps each graph to a unique minimum DFS code as its canonical label. Based on this lexicographic order, gSpan adopts the depth-first search strategy to mine frequent connected subgraphs efficiently. Our performance study shows that gSpan substantially outperforms previous algorithms, sometimes by an order of magnitude.
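To make the "minimum DFS code as canonical label" idea concrete, here is a minimal brute-force sketch. It enumerates every depth-first traversal of a small connected, labeled, undirected graph, writes each one as a DFS code of edge tuples (i, j, l_i, l_(i,j), l_j), and returns the smallest code under the DFS lexicographic order. The function names `order_key` and `min_dfs_code`, and the graph encoding (a list of vertex labels plus a dict of edge labels), are assumptions made for illustration. gSpan itself avoids this exhaustive enumeration, so this only illustrates the definition and is not the paper's procedure.

```python
def order_key(code):
    """Sort key for a DFS code (illustrative helper, not from the paper).

    For two codes of the same graph that share a prefix, the next edge is
    either a backward edge from the rightmost vertex (smaller target index
    first) or a forward edge from the rightmost path (deeper source first),
    and backward edges precede forward edges. Ties fall back to the labels.
    """
    return [
        (0, j, li, le, lj) if i > j else (1, -i, li, le, lj)
        for (i, j, li, le, lj) in code
    ]


def min_dfs_code(vlabels, edges):
    """Brute-force minimum DFS code of a non-empty connected graph.

    vlabels: list of vertex labels, indexed by vertex id (assumed encoding).
    edges:   dict mapping an undirected pair (u, v) to its edge label.
    """
    n = len(vlabels)
    adj = {v: {} for v in range(n)}
    for (u, v), lab in edges.items():
        adj[u][v] = lab
        adj[v][u] = lab

    def extend(stack, index, code):
        # stack: vertices on the rightmost path; index: vertex -> discovery index
        if len(index) == n:
            yield code
            return
        # Backtrack to the deepest vertex that still has an undiscovered neighbor.
        while stack and all(w in index for w in adj[stack[-1]]):
            stack = stack[:-1]
        if not stack:
            raise ValueError("graph must be connected")
        u = stack[-1]
        for w in adj[u]:
            if w in index:
                continue
            j = len(index)
            # Forward edge that discovers w ...
            step = [(index[u], j, vlabels[u], adj[u][w], vlabels[w])]
            # ... followed at once by w's backward edges, nearest-to-root first.
            backs = sorted(
                (x for x in adj[w] if x in index and x != u), key=index.get
            )
            for x in backs:
                step.append((j, index[x], vlabels[w], adj[w][x], vlabels[x]))
            yield from extend(stack + [w], {**index, w: j}, code + step)

    all_codes = (c for s in range(n) for c in extend([s], {s: 0}, []))
    return min(all_codes, key=order_key)


# Two encodings of the same labeled triangle, with vertex ids permuted.
g1 = (["a", "b", "c"], {(0, 1): "x", (1, 2): "y", (0, 2): "z"})
g2 = (["b", "c", "a"], {(2, 0): "x", (0, 1): "y", (2, 1): "z"})
print(min_dfs_code(*g1) == min_dfs_code(*g2))
```

Because the minimum is taken over all traversals, isomorphic graphs map to the same code regardless of how their vertices are numbered, which is the property that lets the code serve as a canonical label. The enumeration is exponential in the worst case, so the sketch is only suitable for very small graphs.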

DOI: 10.1109/ICDM.2002.1184038

5 Figures and Tables

[Figure: Citations per Year, 2003 to 2017; vertical axis 0 to 200]

Semantic Scholar estimates that this publication has 1,960 citations based on the available data.

Cite this paper

@inproceedings{Yan2002gSpanGS,
  title={gSpan: Graph-Based Substructure Pattern Mining},
  author={Xifeng Yan and Jiawei Han},
  booktitle={ICDM},
  year={2002}
}