CLOSET+: Searching for the best strategies for mining frequent closed itemsets

Jianyong Wang, Jiawei Han, Jian Pei

Research output: Contribution to conferencePaperpeer-review

Abstract

Mining frequent closed itemsets provides complete and non-redundant results for frequent pattern analysis. Extensive studies have proposed various strategies for efficient frequent closed itemset mining, such as depth-first search vs. breadthfirst search, vertical formats vs. horizontal formats, tree-structure vs. other data structures, top-down vs. bottom-up traversal, pseudo projection vs. physical projection of conditional database, etc. It is the right time to ask "what are the pros and cons of the strategies?" and "what and how can we pick and integrate the best strategies to achieve higher performance in general cases?"In this study, we answer the above questions by a systematic study of the search strategies and develop a winning algorithm CLOSET+. CLOSET+ integrates the advantages of the previously proposed effective strategies as well as some ones newly developed here. A thorough performance study on synthetic and real data sets has shown the advantages of the strategies and the improvement of CLOSET+ over existing mining algorithms, including CLOSET, CHARM and OP, in terms of runtime, memory usage and scalability.

Original languageEnglish (US)
Pages236-245
Number of pages10
DOIs
StatePublished - 2003
Event9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '03 - Washington, DC, United States
Duration: Aug 24 2003Aug 27 2003

Other

Other9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '03
Country/TerritoryUnited States
CityWashington, DC
Period8/24/038/27/03

Keywords

  • Association rules
  • Frequent closed itemsets

ASJC Scopus subject areas

  • Software
  • Information Systems

Fingerprint

Dive into the research topics of 'CLOSET+: Searching for the best strategies for mining frequent closed itemsets'. Together they form a unique fingerprint.

Cite this