Abstract
Data cube computation is one of the most essential but expensive operations in data warehousing. Previous studies have developed two major approaches, top-down vs. bottom-up. For efficient cube computation in various data distributions, this chapter proposes an interesting cube computation method, Star-Cubing, that integrates the strength of both top-down and bottom-up cube computation, and explores a few additional optimization techniques. It utilizes a star-tree structure, extends the simultaneous aggregation methods, and enables the pruning of the group-by’s that do not satisfy the iceberg condition. The performance study shows that Star-Cubing is highly efficient and outperforms all the previous methods in almost all kinds of data distributions. Two optimization techniques are emphasized: (1) shared aggregation by taking advantage of shared dimensions among the current cuboid and its descendant cuboids; and (2) prune as soon as possible the unpromising cells during the cube computation using the anti-monotonic property of the iceberg cube measure. No previous cubing method has fully explored both optimization methods in one algorithm. Moreover, a new compressed data structure, star-tree, is proposed using star nodes. In addition, a few other optimization techniques contribute to the high performance of the method.
Original language | English (US) |
---|---|
Title of host publication | Proceedings 2003 VLDB Conference |
Subtitle of host publication | 29th International Conference on Very Large Databases (VLDB) |
Publisher | Elsevier |
Pages | 476-487 |
Number of pages | 12 |
ISBN (Electronic) | 9780127224428 |
DOIs | |
State | Published - Jan 1 2003 |
ASJC Scopus subject areas
- General Computer Science