Top-down mining of frequent closed patterns from very high dimensional data

Hongyan Liu, Xiaoyu Wang, Jun He, Jiawei Han, Dong Xin, Zheng Shao

Research output: Contribution to journal › Article › peer-review

Abstract

Frequent pattern mining is an essential theme in data mining. Existing algorithms usually use a bottom-up search strategy. However, for very high dimensional data, this strategy cannot fully utilize the minimum support constraint to prune the rowset search space. In this paper, we propose a new method called top-down mining together with a novel row enumeration tree to make full use of the pruning power of the minimum support constraint. Furthermore, to efficiently check if a rowset is closed, we develop a method called the trace-based method. Based on these methods, an algorithm called TD-Close is designed for mining a complete set of frequent closed patterns. To enhance its performance further, we improve it by using new pruning strategies and new data structures that lead to a new algorithm TTD-Close. Our performance study shows that the top-down strategy is effective in cutting down search space and saving memory space, while the trace-based method facilitates the closeness-checking. As a result, the algorithm TTD-Close outperforms the bottom-up search algorithms such as Carpenter and FPclose in most cases. It also runs faster than TD-Close.
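The top-down idea in the abstract can be illustrated with a minimal sketch. It is not TD-Close or TTD-Close: the function name `closed_patterns_top_down`, the toy table, and the brute-force closedness test are assumptions made for illustration, and the paper's row enumeration tree, trace-based check, and extra pruning strategies are not reproduced. The sketch starts from the full rowset and removes one row at a time, so the minimum support constraint stops the search as soon as a rowset has exactly `min_sup` rows.

```python
from typing import Dict, FrozenSet, List, Set


def closed_patterns_top_down(
    rows: List[Set[str]], min_sup: int
) -> Dict[FrozenSet[str], int]:
    """Return {closed itemset: support} for itemsets with support >= min_sup.

    Illustrative sketch only: rowsets are enumerated top-down and closedness
    is tested by brute force, which is exponential in the number of rows.
    """
    n = len(rows)
    results: Dict[FrozenSet[str], int] = {}

    def items_of(rowset: FrozenSet[int]) -> FrozenSet[str]:
        # Items shared by every row in the rowset.
        it = iter(rowset)
        common = set(rows[next(it)])
        for r in it:
            common &= rows[r]
        return frozenset(common)

    def rows_of(itemset: FrozenSet[str]) -> FrozenSet[int]:
        # All rows that contain every item of the itemset.
        return frozenset(i for i, row in enumerate(rows) if itemset <= row)

    def visit(rowset: FrozenSet[int], start: int) -> None:
        pattern = items_of(rowset)
        # A rowset is closed when no other row also contains its pattern.
        if pattern and rows_of(pattern) == rowset:
            results[pattern] = len(rowset)
        # Minimum support pruning: any child would have fewer than min_sup rows.
        if len(rowset) == min_sup:
            return
        # Remove rows in increasing index order so each subset is visited once.
        for r in range(start, n):
            visit(rowset - {r}, r + 1)

    if 1 <= min_sup <= n:
        visit(frozenset(range(n)), 0)
    return results


# Hypothetical toy table: four rows over the items a, b, c, d.
table = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "d"}]
found = closed_patterns_top_down(table, min_sup=2)
for pattern, support in sorted(found.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(pattern), support)
```

In a bottom-up row enumeration the search grows rowsets from single rows, so small rowsets below the support threshold are still visited on the way up. Searching downward from the full rowset never visits a rowset smaller than `min_sup`, which is the pruning benefit the abstract describes.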

Original language: English (US)
Pages (from-to): 899-924
Number of pages: 26
Journal: Information Sciences
Volume: 179
Issue number: 7
State: Published - Mar 15 2009

Keywords

  • Association rules
  • Data mining
  • Frequent patterns
  • High dimensional data

ASJC Scopus subject areas

  • Software
  • Control and Systems Engineering
  • Theoretical Computer Science
  • Computer Science Applications
  • Information Systems and Management
  • Artificial Intelligence
