gSpan: Graph-based substructure pattern mining

Xifeng Yan, Jiawei Han

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

We investigate new approaches for frequent graph-based pattern mining in graph datasets and propose a novel algorithm called gSpan (graph-based Substructure pattern mining), which discovers frequent substructures without candidate generation. gSpan builds a new lexicographic order among graphs, and maps each graph to a unique minimum DFS code as its canonical label. Based on this lexicographic order, gSpan adopts the depth-first search strategy to mine frequent connected subgraphs efficiently. Our performance study shows thai gSpan substantially outperforms previous algorithms, sometimes by an order of magnitude.

Original languageEnglish (US)
Title of host publicationProceedings - 2002 IEEE International Conference on Data Mining, ICDM 2002
Pages721-724
Number of pages4
StatePublished - Dec 1 2002
Event2nd IEEE International Conference on Data Mining, ICDM '02 - Maebashi, Japan
Duration: Dec 9 2002Dec 12 2002

Publication series

NameProceedings - IEEE International Conference on Data Mining, ICDM
ISSN (Print)1550-4786

Other

Other2nd IEEE International Conference on Data Mining, ICDM '02
CountryJapan
CityMaebashi
Period12/9/0212/12/02

ASJC Scopus subject areas

  • Engineering(all)

Fingerprint Dive into the research topics of 'gSpan: Graph-based substructure pattern mining'. Together they form a unique fingerprint.

  • Cite this

    Yan, X., & Han, J. (2002). gSpan: Graph-based substructure pattern mining. In Proceedings - 2002 IEEE International Conference on Data Mining, ICDM 2002 (pp. 721-724). (Proceedings - IEEE International Conference on Data Mining, ICDM).