Star-cubing: Computing iceberg cubes by top-down and bottom-up integration

Dong Xin, Jiawei Han, Xiaolei Li, Benjamin W. Wah

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Data cube computation is one of the most essential but expensive operations in data warehousing. Previous studies have developed two major approaches, top-down vs. bottom-up. The former, represented by the Multi-Way Array Cube (called MultiWay) algorithm [25], aggregates simultaneously on multiple dimensions; however, it cannot take advantage of Apriori pruning [2] when computing iceberg cubes (cubes that contain only aggregate cells whose measure value satisfies a threshold, called the iceberg condition). The latter, represented by two algorithms, BUC [6] and H-Cubing [11], computes the iceberg cube bottom-up and facilitates Apriori pruning. BUC explores fast sorting and partitioning techniques, whereas H-Cubing explores a data structure, H-Tree, for shared computation. However, none of them fully explores multi-dimensional simultaneous aggregation. In this paper, we present a new method, Star-Cubing, that integrates the strengths of the previous three algorithms and performs aggregations on multiple dimensions simultaneously. It utilizes a star-tree structure, extends the simultaneous aggregation methods, and enables the pruning of the group-by's that do not satisfy the iceberg condition. Our performance study shows that Star-Cubing is highly efficient and outperforms all the previous methods in almost all kinds of data distributions.
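
To make the iceberg condition and Apriori pruning concrete, the following is a minimal Python sketch. It is not the Star-Cubing algorithm and does not use a star-tree; it is a simplified BUC-style recursion, assuming a count measure and a minimum-count threshold as the iceberg condition. The names iceberg_cube, min_count, and ALL are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

ALL = "*"  # hypothetical marker for a dimension that has been aggregated away


def iceberg_cube(rows, min_count):
    """Return {cell: count} for every aggregate cell with count >= min_count.

    Simplified BUC-style sketch (an assumption for illustration, not the
    paper's Star-Cubing method): start from the fully aggregated cell and
    specialize one dimension at a time, partitioning the rows as we go.
    """
    if not rows:
        return {}
    n_dims = len(rows[0])
    result = {}

    def expand(partition, cell, start_dim):
        # `partition` holds exactly the rows that match `cell`.
        if len(partition) < min_count:
            # Apriori pruning: count can only shrink as a cell becomes more
            # specific, so no descendant of this cell can pass the threshold.
            return
        result[tuple(cell)] = len(partition)
        for d in range(start_dim, n_dims):
            # Partition the surviving rows on dimension d.
            groups = defaultdict(list)
            for row in partition:
                groups[row[d]].append(row)
            for value, sub in groups.items():
                cell[d] = value
                expand(sub, cell, d + 1)
            cell[d] = ALL  # restore before moving to the next dimension

    expand(list(rows), [ALL] * n_dims, 0)
    return result


rows = [
    ("a1", "b1", "c1"),
    ("a1", "b1", "c2"),
    ("a1", "b2", "c1"),
    ("a2", "b1", "c1"),
]
cube = iceberg_cube(rows, min_count=2)
```

With min_count=2, the cell (a2, *, *) matches only one row, so it is discarded and none of its more specific cells are ever examined. This is the pruning that, according to the abstract, the top-down MultiWay approach cannot exploit.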

Original language: English (US)
Title of host publication: Proceedings - 29th International Conference on Very Large Data Bases, VLDB 2003
Editors: Patricia G. Selinger, Michael J. Carey, Johann Christoph Freytag, Serge Abiteboul, Peter C. Lockemann, Andreas Heuer
Publisher: Morgan Kaufmann
Pages: 476-487
Number of pages: 12
ISBN (Electronic): 0127224424, 9780127224428
State: Published - 2003
Event: 29th International Conference on Very Large Data Bases, VLDB 2003 - Berlin, Germany
Duration: Sep 9 2003 → Sep 12 2003

Publication series

Name: Proceedings - 29th International Conference on Very Large Data Bases, VLDB 2003

Other

Other: 29th International Conference on Very Large Data Bases, VLDB 2003
Country/Territory: Germany
City: Berlin
Period: 9/9/03 → 9/12/03

ASJC Scopus subject areas

  • Software
  • Information Systems
  • Hardware and Architecture
  • Information Systems and Management
  • Computer Science Applications
  • Computer Networks and Communications
