Mining frequent item sets by opportunistic projection

Junqiang Liu, Yunhe Pan, Ke Wang, Jiawei Han

Research output: Contribution to conferencePaper

Abstract

In this paper, we present a novel algorithm OpportuneProject for mining complete set of frequent item sets by projecting databases to grow a frequent item set tree. Our algorithm is fundamentally different from those proposed in the past in that it opportunistically chooses between two different structures, array-based or tree-based, to represent projected transaction subsets, and heuristically decides to build unfiltered pseudo projection or to make a filtered copy according to features of the subsets. More importantly, we propose novel methods to build tree-based pseudo projections and array-based unfiltered projections for projected transaction subsets, which makes our algorithm both CPU time efficient and memory saving. Basically, the algorithm grows the frequent item set tree by depth first search, whereas breadth first search is used to build the upper portion of the tree if necessary. We test our algorithm versus several other algorithms on real world datasets, such as BMS-POS, and on IBM artificial datasets. The empirical results show that our algorithm is not only the most efficient on both sparse and dense databases at all levels of support threshold, but also highly scalable to very large databases.

Original languageEnglish (US)
Pages229-238
Number of pages10
StatePublished - Dec 1 2002
EventKDD - 2002 Proceedings of the Eight ACM SIGKDD International Conference on Knowledge Discovery and Data Mining - Edmonton, Alta, Canada
Duration: Jul 23 2002Jul 26 2002

Other

OtherKDD - 2002 Proceedings of the Eight ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
CountryCanada
CityEdmonton, Alta
Period7/23/027/26/02

    Fingerprint

Keywords

  • Association rules
  • Frequent patterns

ASJC Scopus subject areas

  • Software
  • Information Systems

Cite this

Liu, J., Pan, Y., Wang, K., & Han, J. (2002). Mining frequent item sets by opportunistic projection. 229-238. Paper presented at KDD - 2002 Proceedings of the Eight ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Edmonton, Alta, Canada.