TY - GEN
T1 - C-cubing
T2 - 22nd International Conference on Data Engineering, ICDE '06
AU - Xin, Dong
AU - Shao, Zheng
AU - Han, Jiawei
AU - Liu, Hongyan
N1 - Copyright 2009 Elsevier B.V., All rights reserved.
PY - 2006
Y1 - 2006
N2 - It is well recognized that data cubing often produces huge outputs. Two popular efforts devoted to this problem are (1) iceberg cube, where only significant cells are kept, and (2) closed cube, where a group of cells which preserve roll-up/drill-down semantics are losslessly compressed to one cell. Due to its usability and importance, efficient computation of closed cubes still warrants a thorough study. In this paper, we propose a new measure, called closedness, for efficient closed data cubing. We show that closedness is an algebraic measure and can be computed efficiently and incrementally. Based on the closedness measure, we develop an aggregation-based approach, called C-Cubing (i.e., Closed-Cubing), and integrate it into two successful iceberg cubing algorithms: MM-Cubing and Star-Cubing. Our performance study shows that C-Cubing runs almost one order of magnitude faster than the previous approaches. We further study how the performance of the alternative algorithms of C-Cubing varies with respect to the properties of the data sets.
AB - It is well recognized that data cubing often produces huge outputs. Two popular efforts devoted to this problem are (1) iceberg cube, where only significant cells are kept, and (2) closed cube, where a group of cells which preserve roll-up/drill-down semantics are losslessly compressed to one cell. Due to its usability and importance, efficient computation of closed cubes still warrants a thorough study. In this paper, we propose a new measure, called closedness, for efficient closed data cubing. We show that closedness is an algebraic measure and can be computed efficiently and incrementally. Based on the closedness measure, we develop an aggregation-based approach, called C-Cubing (i.e., Closed-Cubing), and integrate it into two successful iceberg cubing algorithms: MM-Cubing and Star-Cubing. Our performance study shows that C-Cubing runs almost one order of magnitude faster than the previous approaches. We further study how the performance of the alternative algorithms of C-Cubing varies with respect to the properties of the data sets.
UR - http://www.scopus.com/inward/record.url?scp=33749633463&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=33749633463&partnerID=8YFLogxK
U2 - 10.1109/ICDE.2006.31
DO - 10.1109/ICDE.2006.31
M3 - Conference contribution
AN - SCOPUS:33749633463
SN - 0769525709
SN - 9780769525709
T3 - Proceedings - International Conference on Data Engineering
SP - 4
BT - Proceedings of the 22nd International Conference on Data Engineering, ICDE '06
Y2 - 3 April 2006 through 7 April 2006
ER -