TY - GEN

T1 - Towards robust indexing for ranked queries

AU - Xin, Dong

AU - Chen, Chen

AU - Han, Jiawei

PY - 2006

Y1 - 2006

N2 - Top-k query asks for k tuples ordered according to a specific ranking function that combines the values from multiple participating attributes. The combined score function is usually linear. To efficiently answer top-k queries, preprocessing and indexing the data have been used to speed up the run time performance. Many indexing methods allow the online query algorithms progressively retrieve the data and stop at a certain point. However, in many cases, the number of data accesses is sensitive to the query parameters (i.e., linear weights in the score functions). In this paper, we study the sequentially layered indexing problem where tuples are put into multiple consecutive layers and any top-k query can be answered by at most k layers of tuples. We propose a new criterion for building the layered index. A layered index is robust if for any k, the number of tuples in the top k layers is minimal in comparison with all the other alternatives. The robust index guarantees the worst case performance for arbitrary query parameters. We derive a necessary and sufficient condition for robust index. The problem is shown solvable within O(n d log n) (where d is the number of dimensions, and n is the number of tuples). To reduce the high complexity of the exact solution, we develop an approximate approach, which has time complexity 0(2dn(Iog n) r(d)-1), where r(d) = [d/2] + [ d/2][d/2] Our experimental results show that our proposed method outperforms the best known previous methods.

AB - Top-k query asks for k tuples ordered according to a specific ranking function that combines the values from multiple participating attributes. The combined score function is usually linear. To efficiently answer top-k queries, preprocessing and indexing the data have been used to speed up the run time performance. Many indexing methods allow the online query algorithms progressively retrieve the data and stop at a certain point. However, in many cases, the number of data accesses is sensitive to the query parameters (i.e., linear weights in the score functions). In this paper, we study the sequentially layered indexing problem where tuples are put into multiple consecutive layers and any top-k query can be answered by at most k layers of tuples. We propose a new criterion for building the layered index. A layered index is robust if for any k, the number of tuples in the top k layers is minimal in comparison with all the other alternatives. The robust index guarantees the worst case performance for arbitrary query parameters. We derive a necessary and sufficient condition for robust index. The problem is shown solvable within O(n d log n) (where d is the number of dimensions, and n is the number of tuples). To reduce the high complexity of the exact solution, we develop an approximate approach, which has time complexity 0(2dn(Iog n) r(d)-1), where r(d) = [d/2] + [ d/2][d/2] Our experimental results show that our proposed method outperforms the best known previous methods.

UR - http://www.scopus.com/inward/record.url?scp=35448951221&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=35448951221&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:35448951221

SN - 1595933859

SN - 9781595933850

T3 - VLDB 2006 - Proceedings of the 32nd International Conference on Very Large Data Bases

SP - 235

EP - 246

BT - VLDB 2006 - Proceedings of the 32nd International Conference on Very Large Data Bases

PB - Association for Computing Machinery

T2 - 32nd International Conference on Very Large Data Bases, VLDB 2006

Y2 - 12 September 2006 through 15 September 2006

ER -