Abstract

Mining frequent patterns has been a focused topic in data mining research in recent years, with the development of numerous interesting algorithms for mining association, correlation, causality, sequential patterns, partial periodicity, constraint-based frequent pattern mining, associative classification, emerging patterns, etc. Many studies adopt an Apriori-like, candidate generation-and-test approach. However, based on our analysis, candidate generation and test may still be expensive, especially when encountering long and numerous patterns. A new methodology, called frequent pattern growth, which mines frequent patterns without candidate generation, has been developed. The method adopts a divide-and-conquer philosophy to project and partition databases based on the currently discovered frequent patterns and grow such patterns to longer ones in the projected databases. Moreover, efficient data structures have been developed for effective database compression and fast in-memory traversal. Such a methodology may eliminate or substantially reduce the number of candidate sets to be generated and also reduce the size of the database to be iteratively examined, and, therefore, lead to high performance. In this paper, we provide an overview of this approach and examine its methodology and implications for mining several kinds of frequent patterns, including association, frequent closed itemsets, max-patterns, sequential patterns, and constraint-based mining of frequent patterns. We show that frequent pattern growth is efficient at mining large data-bases and its further development may lead to scalable mining of many other kinds of patterns as well.

Original languageEnglish (US)
Title of host publicationFrequent Pattern Mining
PublisherSpringer International Publishing
Pages65-81
Number of pages17
Volume9783319078212
ISBN (Electronic)9783319078212
ISBN (Print)3319078208, 9783319078205
DOIs
StatePublished - Jul 1 2014

Fingerprint

Pattern recognition
Data mining
Data structures
Data storage equipment

Keywords

  • Associations
  • Constraint-based mining FP-growth
  • Frequent patterns
  • Scalable data mining methods and algorithms
  • Sequential patterns

ASJC Scopus subject areas

  • Computer Science(all)

Cite this

Han, J., & Pei, J. (2014). Pattern-growth methods. In Frequent Pattern Mining (Vol. 9783319078212, pp. 65-81). Springer International Publishing. https://doi.org/10.1007/978-3-319-07821-2_3

Pattern-growth methods. / Han, Jiawei; Pei, Jian.

Frequent Pattern Mining. Vol. 9783319078212 Springer International Publishing, 2014. p. 65-81.

Research output: Chapter in Book/Report/Conference proceedingChapter

Han, J & Pei, J 2014, Pattern-growth methods. in Frequent Pattern Mining. vol. 9783319078212, Springer International Publishing, pp. 65-81. https://doi.org/10.1007/978-3-319-07821-2_3
Han J, Pei J. Pattern-growth methods. In Frequent Pattern Mining. Vol. 9783319078212. Springer International Publishing. 2014. p. 65-81 https://doi.org/10.1007/978-3-319-07821-2_3
Han, Jiawei ; Pei, Jian. / Pattern-growth methods. Frequent Pattern Mining. Vol. 9783319078212 Springer International Publishing, 2014. pp. 65-81
@inbook{daa69aa20c034ee09af6b2abb69a945b,
title = "Pattern-growth methods",
abstract = "Mining frequent patterns has been a focused topic in data mining research in recent years, with the development of numerous interesting algorithms for mining association, correlation, causality, sequential patterns, partial periodicity, constraint-based frequent pattern mining, associative classification, emerging patterns, etc. Many studies adopt an Apriori-like, candidate generation-and-test approach. However, based on our analysis, candidate generation and test may still be expensive, especially when encountering long and numerous patterns. A new methodology, called frequent pattern growth, which mines frequent patterns without candidate generation, has been developed. The method adopts a divide-and-conquer philosophy to project and partition databases based on the currently discovered frequent patterns and grow such patterns to longer ones in the projected databases. Moreover, efficient data structures have been developed for effective database compression and fast in-memory traversal. Such a methodology may eliminate or substantially reduce the number of candidate sets to be generated and also reduce the size of the database to be iteratively examined, and, therefore, lead to high performance. In this paper, we provide an overview of this approach and examine its methodology and implications for mining several kinds of frequent patterns, including association, frequent closed itemsets, max-patterns, sequential patterns, and constraint-based mining of frequent patterns. We show that frequent pattern growth is efficient at mining large data-bases and its further development may lead to scalable mining of many other kinds of patterns as well.",
keywords = "Associations, Constraint-based mining FP-growth, Frequent patterns, Scalable data mining methods and algorithms, Sequential patterns",
author = "Jiawei Han and Jian Pei",
year = "2014",
month = "7",
day = "1",
doi = "10.1007/978-3-319-07821-2_3",
language = "English (US)",
isbn = "3319078208",
volume = "9783319078212",
pages = "65--81",
booktitle = "Frequent Pattern Mining",
publisher = "Springer International Publishing",

}

TY - CHAP

T1 - Pattern-growth methods

AU - Han, Jiawei

AU - Pei, Jian

PY - 2014/7/1

Y1 - 2014/7/1

N2 - Mining frequent patterns has been a focused topic in data mining research in recent years, with the development of numerous interesting algorithms for mining association, correlation, causality, sequential patterns, partial periodicity, constraint-based frequent pattern mining, associative classification, emerging patterns, etc. Many studies adopt an Apriori-like, candidate generation-and-test approach. However, based on our analysis, candidate generation and test may still be expensive, especially when encountering long and numerous patterns. A new methodology, called frequent pattern growth, which mines frequent patterns without candidate generation, has been developed. The method adopts a divide-and-conquer philosophy to project and partition databases based on the currently discovered frequent patterns and grow such patterns to longer ones in the projected databases. Moreover, efficient data structures have been developed for effective database compression and fast in-memory traversal. Such a methodology may eliminate or substantially reduce the number of candidate sets to be generated and also reduce the size of the database to be iteratively examined, and, therefore, lead to high performance. In this paper, we provide an overview of this approach and examine its methodology and implications for mining several kinds of frequent patterns, including association, frequent closed itemsets, max-patterns, sequential patterns, and constraint-based mining of frequent patterns. We show that frequent pattern growth is efficient at mining large data-bases and its further development may lead to scalable mining of many other kinds of patterns as well.

AB - Mining frequent patterns has been a focused topic in data mining research in recent years, with the development of numerous interesting algorithms for mining association, correlation, causality, sequential patterns, partial periodicity, constraint-based frequent pattern mining, associative classification, emerging patterns, etc. Many studies adopt an Apriori-like, candidate generation-and-test approach. However, based on our analysis, candidate generation and test may still be expensive, especially when encountering long and numerous patterns. A new methodology, called frequent pattern growth, which mines frequent patterns without candidate generation, has been developed. The method adopts a divide-and-conquer philosophy to project and partition databases based on the currently discovered frequent patterns and grow such patterns to longer ones in the projected databases. Moreover, efficient data structures have been developed for effective database compression and fast in-memory traversal. Such a methodology may eliminate or substantially reduce the number of candidate sets to be generated and also reduce the size of the database to be iteratively examined, and, therefore, lead to high performance. In this paper, we provide an overview of this approach and examine its methodology and implications for mining several kinds of frequent patterns, including association, frequent closed itemsets, max-patterns, sequential patterns, and constraint-based mining of frequent patterns. We show that frequent pattern growth is efficient at mining large data-bases and its further development may lead to scalable mining of many other kinds of patterns as well.

KW - Associations

KW - Constraint-based mining FP-growth

KW - Frequent patterns

KW - Scalable data mining methods and algorithms

KW - Sequential patterns

UR - http://www.scopus.com/inward/record.url?scp=84930327032&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84930327032&partnerID=8YFLogxK

U2 - 10.1007/978-3-319-07821-2_3

DO - 10.1007/978-3-319-07821-2_3

M3 - Chapter

AN - SCOPUS:84930327032

SN - 3319078208

SN - 9783319078205

VL - 9783319078212

SP - 65

EP - 81

BT - Frequent Pattern Mining

PB - Springer International Publishing

ER -