C-cubing: Efficient computation of closed cubes by aggregation-based checking

Dong Xin, Zheng Shao, Jiawei Han, Hongyan Liu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

It is well recognized that data cubing often produces huge outputs. Two popular efforts devoted to this problem are (1) the iceberg cube, where only significant cells are kept, and (2) the closed cube, where a group of cells that preserve roll-up/drill-down semantics is losslessly compressed to one cell. Due to its usability and importance, efficient computation of closed cubes still warrants a thorough study. In this paper, we propose a new measure, called closedness, for efficient closed data cubing. We show that closedness is an algebraic measure and can be computed efficiently and incrementally. Based on the closedness measure, we develop an aggregation-based approach, called C-Cubing (i.e., Closed-Cubing), and integrate it into two successful iceberg cubing algorithms: MM-Cubing and Star-Cubing. Our performance study shows that C-Cubing runs almost one order of magnitude faster than the previous approaches. We further study how the performance of the alternative C-Cubing algorithms varies with respect to the properties of the data sets.
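To make the idea of an algebraic closedness check concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not give the data structures, so the summary used here (one representative row plus a per-dimension "all rows agree" mask) and the names `closed_mask`, `merge`, and `is_closed` are hypothetical. The sketch assumes the usual closed-cell definition: a cell is closed if none of its `*` dimensions can be replaced by a concrete value without losing any of the rows it covers.

```python
from typing import List, Optional, Sequence, Tuple

Row = Tuple[str, ...]
Cell = Tuple[Optional[str], ...]      # None plays the role of '*'
Summary = Tuple[Row, List[bool]]      # (representative row, agreement mask)


def closed_mask(rows: Sequence[Row]) -> Summary:
    """Summarise a non-empty group of rows (hypothetical helper).

    mask[d] is True when every row in the group has the same value
    in dimension d.
    """
    rep = rows[0]
    mask = [all(r[d] == rep[d] for r in rows) for d in range(len(rep))]
    return rep, mask


def merge(a: Summary, b: Summary) -> Summary:
    """Combine the summaries of two groups without rescanning their rows.

    Dimension d stays 'all equal' only if both groups were uniform in d
    and their representatives agree in d. The summary has bounded size,
    which is what lets it be aggregated incrementally.
    """
    rep_a, mask_a = a
    rep_b, mask_b = b
    mask = [
        mask_a[d] and mask_b[d] and rep_a[d] == rep_b[d]
        for d in range(len(rep_a))
    ]
    return rep_a, mask


def is_closed(cell: Cell, mask: Sequence[bool]) -> bool:
    """A cell is not closed if some '*' dimension is uniform over its rows."""
    return not any(v is None and m for v, m in zip(cell, mask))


if __name__ == "__main__":
    r1: Row = ("a1", "b1", "c1")
    r2: Row = ("a1", "b1", "c2")

    # Build the summary incrementally from single-row summaries.
    _, mask = merge(closed_mask([r1]), closed_mask([r2]))

    print(is_closed(("a1", None, None), mask))  # dimension B is uniform
    print(is_closed(("a1", "b1", None), mask))  # only C varies
```

In this toy data set the cell (a1, \*, \*) covers the same two rows as (a1, b1, \*), so only the latter needs to be kept. The point of `merge` is that the check uses only the aggregated summaries, never the underlying rows.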

Original language: English (US)
Title of host publication: Proceedings of the 22nd International Conference on Data Engineering, ICDE '06
Pages: 4
Number of pages: 1
State: Published - 2006
Event: 22nd International Conference on Data Engineering, ICDE '06 - Atlanta, GA, United States
Duration: Apr 3 2006 → Apr 7 2006

Publication series

Name: Proceedings - International Conference on Data Engineering
Volume: 2006
ISSN (Print): 1084-4627

Other

Other: 22nd International Conference on Data Engineering, ICDE '06
Country: United States
City: Atlanta, GA
Period: 4/3/06 → 4/7/06

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Information Systems
