gSpan: Graph-Based Substructure Pattern Mining

Abstract

We investigate new approaches for frequent graph-based pattern mining in graph datasets and propose a novel algorithm called gSpan (graph-based Substructure pattern mining), which discovers frequent substructures without candidate generation. gSpan builds a new lexicographic order among graphs and maps each graph to a unique minimum DFS code as its canonical label. Based on this lexicographic order, gSpan adopts a depth-first search strategy to mine frequent connected subgraphs efficiently. Our performance study shows that gSpan substantially outperforms previous algorithms, sometimes by an order of magnitude.
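
To make the idea of a canonical label concrete, here is a minimal Python sketch, not the authors' implementation. It lists every DFS code of a small labeled graph by brute force and keeps the smallest one. The function names dfs_codes and min_dfs_code are hypothetical. The sketch assumes a simple, connected, undirected graph whose labels are mutually comparable (strings here). It also compares codes with plain tuple ordering, which is a simplified stand-in for the DFS lexicographic order defined in the paper. Any fixed total order still gives isomorphic graphs the same label, but the winning code may differ from the paper's minimum DFS code.

```python
def dfs_codes(vlabels, elabels):
    """Yield every DFS code of a simple, connected, undirected labeled graph.

    vlabels: dict vertex -> label
    elabels: dict (u, v) -> label, one entry per undirected edge
    A code is a tuple of edges (i, j, label_i, edge_label, label_j),
    where i and j are DFS discovery indices.
    """
    adj = {v: {} for v in vlabels}
    for (u, v), lab in elabels.items():
        adj[u][v] = lab
        adj[v][u] = lab

    def grow(code, idx, stack):
        # Backtrack while the top of the stack has no undiscovered neighbour.
        while stack and all(w in idx for w in adj[stack[-1]]):
            stack = stack[:-1]
        if not stack:
            # Every vertex (and hence every edge) has been visited.
            yield tuple(code)
            return
        u = stack[-1]
        for w in adj[u]:
            if w in idx:
                continue
            new_idx = {**idx, w: len(idx)}
            # Forward edge that discovers w.
            step = [(idx[u], new_idx[w], vlabels[u], adj[u][w], vlabels[w])]
            # Backward edges from w to earlier vertices other than its parent,
            # ordered by the discovery index of the target.
            back = sorted((idx[x], x) for x in adj[w] if x in idx and x != u)
            step += [(new_idx[w], i, vlabels[w], adj[w][x], vlabels[x])
                     for i, x in back]
            yield from grow(code + step, new_idx, stack + [w])

    for root in vlabels:
        yield from grow([], {root: 0}, [root])


def min_dfs_code(vlabels, elabels):
    """Canonical label: the smallest DFS code under plain tuple ordering
    (an assumption of this sketch, not the paper's exact order)."""
    return min(dfs_codes(vlabels, elabels))


# The same labeled triangle written with two different vertex numberings.
g1 = ({0: "X", 1: "Y", 2: "Z"},
      {(0, 1): "a", (1, 2): "b", (0, 2): "c"})
g2 = ({10: "Z", 11: "X", 12: "Y"},
      {(11, 12): "a", (12, 10): "b", (11, 10): "c"})
# A triangle that differs in one edge label.
g3 = ({0: "X", 1: "Y", 2: "Z"},
      {(0, 1): "a", (1, 2): "b", (0, 2): "a"})

same = min_dfs_code(*g1) == min_dfs_code(*g2)       # isomorphic graphs
different = min_dfs_code(*g1) != min_dfs_code(*g3)  # non-isomorphic graphs
```

Because the code records only discovery indices and labels, the set of DFS codes does not depend on how the vertices happen to be numbered, so taking the minimum yields one label per isomorphism class. The enumeration is exponential in the worst case and is meant only for toy graphs. Per the abstract, gSpan mines by depth-first search guided by the lexicographic order rather than by exhaustive listing of this kind.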

DOI: 10.1109/ICDM.2002.1184038

