On efficient processing of subspace skyline queries on high dimensional data

Wen Jin, Anthony K.H. Tung, Martin Ester, Jiawei Han

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

Recent studies on efficiently answering subspace skyline queries can be separated into two approaches. The first focused on pre-materializing a set of skylines points in various subspaces while the second focus on dynamically answering the queries by using a set of anchors to prune off skyline points through spatial reasoning. Despite effort to compress the pre-materialized subspace skylines through removal of redundancy, the storage space for the first approach remain exponential in the number of dimensions. The query time for the second approach on the other hand also grow substantially for data with higher dimensionality where the pruning power of anchors become much weaker. In this paper, we propose methods for answering subspace skyline query on high dimensional data such that both prematerialization storage and query time can be moderated. We propose novel notions of maximal partial-dominating space, maximal partial-dominated space and the maximal equality space between pairs of skyline objects in the full space and use these concepts as the foundation for answering subspace skyline queries for high dimensional data. Query processing involves mostly simple pruning operations while skyline computation is done only on a small subset of candidate skyline points in the subspace. We also develop a random sampling method to compute the subspace skyline in an on-line fashion. Extensive experiments have been conducted and demonstrated the efficiency and effectiveness of our methods.

Original languageEnglish (US)
Title of host publication19th International Conference on Scientific and Statistical Database Management, SSDBM 2007
DOIs
StatePublished - 2007
Event19th International Conference on Scientific and Statistical Database Management, SSDBM 2007 - Banff, AB, Canada
Duration: Jul 9 2007Jul 11 2007

Publication series

NameProceedings of the International Conference on Scientific and Statistical Database Management, SSDBM
ISSN (Print)1099-3371

Other

Other19th International Conference on Scientific and Statistical Database Management, SSDBM 2007
Country/TerritoryCanada
CityBanff, AB
Period7/9/077/11/07

ASJC Scopus subject areas

  • Software
  • Applied Mathematics

Fingerprint

Dive into the research topics of 'On efficient processing of subspace skyline queries on high dimensional data'. Together they form a unique fingerprint.

Cite this