Efficient computation of iceberg cubes with complex measures

Jiawei Han, Jian Pei, Guozhu Dong, Ke Wang

Research output: Contribution to journal › Conference article › peer-review

Abstract

It is often too expensive to compute and materialize a complete high-dimensional data cube. Computing an iceberg cube, which contains only aggregates above certain thresholds, is an effective way to derive nontrivial multidimensional aggregations for OLAP and data mining. In this paper, we study efficient methods for computing iceberg cubes with some popularly used complex measures, such as average, and develop a methodology that adopts a weaker but anti-monotonic condition for testing and pruning the search space. In particular, for efficient computation of iceberg cubes with the average measure, we propose a top-k average pruning method and extend two previously studied methods, Apriori and BUC, to Top-k Apriori and Top-k BUC. To further improve the performance, an interesting hypertree structure, called H-tree, is designed and a new iceberg cubing method, called Top-k H-Cubing, is developed. Our performance study shows that Top-k BUC and Top-k H-Cubing are two promising candidates for scalable computation, and Top-k H-Cubing has better performance in most cases.
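
The pruning idea in the abstract can be illustrated with a minimal sketch. It is not the authors' code, and the abstract does not spell out the exact iceberg condition; the sketch assumes a condition of the form "count >= k and average >= v". Under that assumption, the average itself is not anti-monotonic, but the average of the k largest measure values in a cell is an upper bound on the average of every sub-cell with at least k tuples, so a cell whose top-k average falls below v can be pruned together with all of its descendants. The function names, the row layout, and the BUC-style bottom-up recursion below are hypothetical choices made for illustration; the H-tree structure is not modeled.

```python
from collections import defaultdict
from heapq import nlargest


def top_k_avg(values, k):
    """Average of the k largest values, or None if there are fewer than k."""
    if len(values) < k:
        return None
    return sum(nlargest(k, values)) / k


def iceberg_avg_cells(rows, num_dims, k, v):
    """Return {cell: avg} for cells with count >= k and avg >= v.

    rows is a list of (dims_tuple, measure) pairs (an assumed layout).
    A cell is a tuple of dimension values, with None standing for '*'.
    """
    result = {}

    def expand(cell, part, start_dim):
        values = [measure for _, measure in part]
        bound = top_k_avg(values, k)
        # Prune when the cell has fewer than k rows, or when even its
        # best k rows average below v: no sub-cell can then qualify.
        if bound is None or bound < v:
            return
        avg = sum(values) / len(values)
        if avg >= v:
            result[cell] = avg
        # Bottom-up expansion: fix one more dimension at a time.
        for d in range(start_dim, num_dims):
            groups = defaultdict(list)
            for row in part:
                groups[row[0][d]].append(row)
            for value, sub in groups.items():
                child = cell[:d] + (value,) + cell[d + 1:]
                expand(child, sub, d + 1)

    expand((None,) * num_dims, rows, 0)
    return result
```

Note that a cell can pass the top-k test without being reported: the weaker condition only decides whether to keep searching, while the real average decides whether the cell belongs to the iceberg cube.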

Original language: English (US)
Pages (from-to): 1-12
Number of pages: 12
Journal: Proceedings of the ACM SIGMOD International Conference on Management of Data
State: Published - 2001
Externally published: Yes
Event: 2001 ACM SIGMOD International Conference on Management of Data - Santa Barbara, CA, United States
Duration: May 21 2001 to May 24 2001

ASJC Scopus subject areas

  • Software
  • Information Systems
