TY - GEN
T1 - Towards robust indexing for ranked queries
AU - Xin, Dong
AU - Chen, Chen
AU - Han, Jiawei
PY - 2006
Y1 - 2006
N2 - Top-k query asks for k tuples ordered according to a specific ranking function that combines the values from multiple participating attributes. The combined score function is usually linear. To efficiently answer top-k queries, preprocessing and indexing the data have been used to speed up the run time performance. Many indexing methods allow the online query algorithms progressively retrieve the data and stop at a certain point. However, in many cases, the number of data accesses is sensitive to the query parameters (i.e., linear weights in the score functions). In this paper, we study the sequentially layered indexing problem where tuples are put into multiple consecutive layers and any top-k query can be answered by at most k layers of tuples. We propose a new criterion for building the layered index. A layered index is robust if for any k, the number of tuples in the top k layers is minimal in comparison with all the other alternatives. The robust index guarantees the worst case performance for arbitrary query parameters. We derive a necessary and sufficient condition for robust index. The problem is shown solvable within O(n d log n) (where d is the number of dimensions, and n is the number of tuples). To reduce the high complexity of the exact solution, we develop an approximate approach, which has time complexity 0(2dn(Iog n) r(d)-1), where r(d) = [d/2] + [ d/2][d/2] Our experimental results show that our proposed method outperforms the best known previous methods.
AB - Top-k query asks for k tuples ordered according to a specific ranking function that combines the values from multiple participating attributes. The combined score function is usually linear. To efficiently answer top-k queries, preprocessing and indexing the data have been used to speed up the run time performance. Many indexing methods allow the online query algorithms progressively retrieve the data and stop at a certain point. However, in many cases, the number of data accesses is sensitive to the query parameters (i.e., linear weights in the score functions). In this paper, we study the sequentially layered indexing problem where tuples are put into multiple consecutive layers and any top-k query can be answered by at most k layers of tuples. We propose a new criterion for building the layered index. A layered index is robust if for any k, the number of tuples in the top k layers is minimal in comparison with all the other alternatives. The robust index guarantees the worst case performance for arbitrary query parameters. We derive a necessary and sufficient condition for robust index. The problem is shown solvable within O(n d log n) (where d is the number of dimensions, and n is the number of tuples). To reduce the high complexity of the exact solution, we develop an approximate approach, which has time complexity 0(2dn(Iog n) r(d)-1), where r(d) = [d/2] + [ d/2][d/2] Our experimental results show that our proposed method outperforms the best known previous methods.
UR - http://www.scopus.com/inward/record.url?scp=35448951221&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=35448951221&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:35448951221
SN - 1595933859
SN - 9781595933850
T3 - VLDB 2006 - Proceedings of the 32nd International Conference on Very Large Data Bases
SP - 235
EP - 246
BT - VLDB 2006 - Proceedings of the 32nd International Conference on Very Large Data Bases
PB - Association for Computing Machinery
T2 - 32nd International Conference on Very Large Data Bases, VLDB 2006
Y2 - 12 September 2006 through 15 September 2006
ER -