High-Dimensional OLAP

Xiaolei Li, Jiawei Han, Hector Gonzalez

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Abstract

This chapter proposes a novel approach to online analytical processing (OLAP) on high-dimensional data sets with a moderate number of tuples. Data cubes play an essential role in fast OLAP in many multi-dimensional data warehouses. However, data sets in applications such as bioinformatics, statistics, and text processing are often characterized by high dimensionality and moderate size, and no feasible data cube can be constructed for them. Data analysis tasks may involve a high-dimensional space, but most OLAP operations are performed on only a small number of dimensions at a time. Using inverted indices and pre-aggregated results, OLAP queries are computed online by dynamically constructing cuboids from the fragment data cubes. With this design, the total space needed to store such shell fragments for high-dimensional OLAP is negligible compared with a full high-dimensional cube, and so is the online computation overhead. The investigation shows that the storage cost grows linearly with the number of dimensions. Moreover, the query I/O costs for large data sets are reasonable and are comparable to those of a materialized data cube, when such a cube is available.
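
The idea of answering queries from inverted indices over small fragment cubes can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `build_shell_fragments` and `count_query`, the dimension names `A` to `D`, and the sample rows are all hypothetical, and the sketch assumes that every dimension belongs to exactly one fragment and that the only measure is a tuple count.

```python
from itertools import combinations


def build_shell_fragments(rows, fragments):
    """Build one local cube per fragment (hypothetical helper).

    Each local cube maps (dims, values) to the set of tuple IDs that
    carry those values, i.e. an inverted index for every cuboid cell
    inside the fragment.
    """
    cubes = {}
    for frag in fragments:
        cube = {}
        # Enumerate every non-empty subset of the fragment's dimensions.
        for k in range(1, len(frag) + 1):
            for dims in combinations(frag, k):
                for tid, row in enumerate(rows):
                    key = (dims, tuple(row[d] for d in dims))
                    cube.setdefault(key, set()).add(tid)
        cubes[frag] = cube
    return cubes


def count_query(rows, cubes, conditions):
    """Count tuples matching `conditions` (dimension name -> value).

    Assumes every queried dimension belongs to exactly one fragment.
    """
    tids = set(range(len(rows)))
    for frag, cube in cubes.items():
        # Keep fragment order so the key matches the one built above.
        dims = tuple(d for d in frag if d in conditions)
        if dims:
            key = (dims, tuple(conditions[d] for d in dims))
            # Intersect tuple-ID lists across fragments at query time.
            tids &= cube.get(key, set())
    return len(tids)


# Hypothetical sample data: four tuples over four dimensions.
rows = [
    {"A": "a1", "B": "b1", "C": "c1", "D": "d1"},
    {"A": "a1", "B": "b2", "C": "c1", "D": "d2"},
    {"A": "a2", "B": "b1", "C": "c1", "D": "d1"},
    {"A": "a1", "B": "b1", "C": "c2", "D": "d1"},
]
cubes = build_shell_fragments(rows, [("A", "B"), ("C", "D")])
print(count_query(rows, cubes, {"A": "a1", "C": "c1"}))
```

In this sketch a query that spans both fragments is answered by intersecting two precomputed tuple-ID sets, so no cuboid over all four dimensions is ever materialized. Each fragment stores only the cells of its own small cube, which is why storage grows with the number of fragments and not with the number of dimension combinations.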

Original language: English (US)
Title of host publication: Proceedings 2004 VLDB Conference
Subtitle of host publication: The 30th International Conference on Very Large Databases (VLDB)
Publisher: Elsevier
Pages: 528-539
Number of pages: 12
ISBN (Electronic): 9780120884698
DOIs
State: Published - Jan 1 2004

ASJC Scopus subject areas

  • General Computer Science
