Pattern-growth methods

Jiawei Han, Jian Pei

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Abstract

Mining frequent patterns has been a focused topic in data mining research in recent years, with the development of numerous interesting algorithms for mining association, correlation, causality, sequential patterns, partial periodicity, constraint-based frequent pattern mining, associative classification, emerging patterns, etc. Many studies adopt an Apriori-like, candidate generation-and-test approach. However, based on our analysis, candidate generation and test may still be expensive, especially when encountering long and numerous patterns. A new methodology, called frequent pattern growth, which mines frequent patterns without candidate generation, has been developed. The method adopts a divide-and-conquer philosophy to project and partition databases based on the currently discovered frequent patterns and grow such patterns to longer ones in the projected databases. Moreover, efficient data structures have been developed for effective database compression and fast in-memory traversal. Such a methodology may eliminate or substantially reduce the number of candidate sets to be generated and also reduce the size of the database to be iteratively examined, and, therefore, lead to high performance. In this paper, we provide an overview of this approach and examine its methodology and implications for mining several kinds of frequent patterns, including association, frequent closed itemsets, max-patterns, sequential patterns, and constraint-based mining of frequent patterns. We show that frequent pattern growth is efficient at mining large databases and its further development may lead to scalable mining of many other kinds of patterns as well.
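
As a rough illustration of the divide-and-conquer idea described in the abstract, the following Python sketch grows each frequent pattern inside its own projected database instead of generating and testing candidates. It is a minimal sketch under stated assumptions, not the chapter's implementation: it uses plain lists rather than the compressed FP-tree structure, and the names `pattern_growth`, `transactions`, and `min_support` are hypothetical.

```python
from collections import Counter


def pattern_growth(transactions, min_support, prefix=()):
    """Yield (pattern, support) pairs by growing `prefix` in its projected database.

    Illustrative only: `transactions` is a list of item lists and `min_support`
    is an absolute count. Both names are assumptions made for this sketch.
    """
    # Count each item at most once per transaction in the current database.
    counts = Counter(item for t in transactions for item in set(t))
    for item in sorted(counts):
        support = counts[item]
        if support < min_support:
            continue
        pattern = prefix + (item,)
        yield pattern, support
        # Project the database on `item`: keep only the transactions that
        # contain it, restricted to items that sort after it, so that each
        # itemset is produced exactly once.
        projected = [
            [other for other in t if other > item]
            for t in transactions
            if item in t
        ]
        # Empty transactions cannot extend the pattern any further.
        projected = [t for t in projected if t]
        yield from pattern_growth(projected, min_support, pattern)
```

Calling `dict(pattern_growth(transactions, min_support))` collects every frequent itemset with its support. Each recursive call scans only the transactions that already contain the current pattern, which is the database-shrinking effect the abstract refers to.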

Original language: English (US)
Title of host publication: Frequent Pattern Mining
Publisher: Springer
Pages: 65-81
Number of pages: 17
Volume: 9783319078212
ISBN (Electronic): 9783319078212
ISBN (Print): 3319078208, 9783319078205
State: Published - Jul 1 2014

Keywords

  • Associations
  • Constraint-based mining
  • FP-growth
  • Frequent patterns
  • Scalable data mining methods and algorithms
  • Sequential patterns

ASJC Scopus subject areas

  • General Computer Science
