Computing iceberg cubes by top-down and bottom-up integration: The starcubing approach

Dong Xin, Jiawei Han, Xiaolei Li, Zheng Shao, Benjamin W. Wah

Research output: Contribution to journal, Article, peer-review

Abstract

Data cube computation is one of the most essential but expensive operations in data warehousing. Previous studies have developed two major approaches, top-down versus bottom-up. The former, represented by the MultiWay Array Cube (called the Multiway) algorithm [30], aggregates simultaneously on multiple dimensions; however, it cannot take advantage of a priori pruning [2] when computing iceberg cubes (cubes that contain only aggregate cells whose measure values satisfy a threshold, called the iceberg condition). The latter, represented by BUC [6], computes the iceberg cube bottom-up and facilitates a priori pruning. BUC explores fast sorting and partitioning techniques; however, it does not fully explore multidimensional simultaneous aggregation. In this paper, we present a new method, Star-Cubing, that integrates the strengths of the previous two algorithms and performs aggregations on multiple dimensions simultaneously. It utilizes a star-tree structure, extends the simultaneous aggregation methods, and enables the pruning of the group-bys that do not satisfy the iceberg condition. Our performance study shows that Star-Cubing is highly efficient and outperforms the previous methods.
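To make the iceberg condition and bottom-up pruning concrete, here is a minimal sketch in Python. It is not the Star-Cubing algorithm and does not build a star-tree. It is a simplified BUC-style recursion that assumes the measure is a row count and the iceberg condition is a minimum count. The function name `iceberg_cube`, its parameters, and the sample rows are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def iceberg_cube(rows, num_dims, min_count):
    """Return {cell: count} for every cell whose count is >= min_count.

    A cell is a tuple of length num_dims; "*" marks an aggregated dimension.
    Assumption: the measure is count(*), which is anti-monotone, so a cell
    that fails the threshold cannot have a more specific cell that passes.
    """
    result = {}

    def expand(partition, cell, start_dim):
        # `partition` holds exactly the rows that match `cell`.
        if len(partition) < min_count:
            return  # prune this cell and every more specific cell below it
        result[cell] = len(partition)
        for d in range(start_dim, num_dims):
            # Partition the current rows on dimension d.
            groups = defaultdict(list)
            for row in partition:
                groups[row[d]].append(row)
            for value, group in groups.items():
                child = cell[:d] + (value,) + cell[d + 1:]
                expand(group, child, d + 1)

    expand(rows, ("*",) * num_dims, 0)
    return result


# Hypothetical three-dimensional fact table with four rows.
rows = [
    ("a1", "b1", "c1"),
    ("a1", "b1", "c2"),
    ("a1", "b2", "c1"),
    ("a2", "b1", "c1"),
]
cube = iceberg_cube(rows, num_dims=3, min_count=2)
```

With a minimum count of 2, only seven cells survive for this input, for example `("a1", "b1", "*")` with count 2. The partition for `("a2", "*", "*")` has a single row, so the recursion stops there and never visits its descendants. Each recursive call in this sketch aggregates along one dimension at a time, which is the limitation the abstract attributes to the bottom-up approach.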

Original language: English (US)
Pages (from-to): 111-126
Number of pages: 16
Journal: IEEE Transactions on Knowledge and Data Engineering
Volume: 19
Issue number: 1
State: Published - Jan 2007

Keywords

  • Data mining
  • Data warehouse
  • Online analytical processing (OLAP)

ASJC Scopus subject areas

  • Information Systems
  • Computer Science Applications
  • Computational Theory and Mathematics
