TY - GEN
T1 - Fast inbound top-k query for random walk with restart
AU - Zhang, Chao
AU - Jiang, Shan
AU - Chen, Yucheng
AU - Sun, Yidan
AU - Han, Jiawei
N1 - Publisher Copyright:
© Springer International Publishing Switzerland 2015.
PY - 2015
Y1 - 2015
N2 - Random walk with restart (RWR) is widely recognized as one of the most important node proximity measures for graphs, as it captures the holistic graph structure and is robust to noise in the graph. In this paper, we study a novel query based on the RWR measure, called the inbound top-k (Ink) query. Given a query node q and a number k, the Ink query aims at retrieving knodes in the graph that have the largest weighted RWR scores to q. Inkqueries can be highly useful for various applications such as traffic scheduling, disease treatment, and targeted advertising. Nevertheless, none of the existing RWR computation techniques can accurately and efficiently process the Inkquery in large graphs. We propose two algorithms, namely Squeeze and Ripple, both of which can accurately answer the Ink query in a fast and incremental manner. To identify the top-k nodes, Squeeze iteratively performs matrix-vector multiplication and estimates the lower and upper bounds for all the nodes in the graph. Ripple employs a more aggressive strategy by only estimating the RWR scores for the nodes falling in the vicinity of q, the nodes outside the vicinity do not need to be evaluated because their RWR scores are propagated from the boundary of the vicinity and thus upper bounded. Ripple incrementally expands the vicinity until the top-k result set can be obtained. Our extensive experiments on real-life graph data sets show that Ink queries can retrieve interesting results, and the proposed algorithms are orders of magnitude faster than state-of-the-art method.
AB - Random walk with restart (RWR) is widely recognized as one of the most important node proximity measures for graphs, as it captures the holistic graph structure and is robust to noise in the graph. In this paper, we study a novel query based on the RWR measure, called the inbound top-k (Ink) query. Given a query node q and a number k, the Ink query aims at retrieving knodes in the graph that have the largest weighted RWR scores to q. Inkqueries can be highly useful for various applications such as traffic scheduling, disease treatment, and targeted advertising. Nevertheless, none of the existing RWR computation techniques can accurately and efficiently process the Inkquery in large graphs. We propose two algorithms, namely Squeeze and Ripple, both of which can accurately answer the Ink query in a fast and incremental manner. To identify the top-k nodes, Squeeze iteratively performs matrix-vector multiplication and estimates the lower and upper bounds for all the nodes in the graph. Ripple employs a more aggressive strategy by only estimating the RWR scores for the nodes falling in the vicinity of q, the nodes outside the vicinity do not need to be evaluated because their RWR scores are propagated from the boundary of the vicinity and thus upper bounded. Ripple incrementally expands the vicinity until the top-k result set can be obtained. Our extensive experiments on real-life graph data sets show that Ink queries can retrieve interesting results, and the proposed algorithms are orders of magnitude faster than state-of-the-art method.
UR - http://www.scopus.com/inward/record.url?scp=84959326710&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84959326710&partnerID=8YFLogxK
U2 - 10.1007/978-3-319-23525-7_37
DO - 10.1007/978-3-319-23525-7_37
M3 - Conference contribution
C2 - 26709392
AN - SCOPUS:84959326710
SN - 9783319235240
SN - 9783319235240
SN - 9783319235240
SN - 9783319235240
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 608
EP - 624
BT - Machine Learning and Knowledge Discovery in Databases - European Conference, ECML PKDD 2015
A2 - Costa, Vitor Santos
A2 - Soares, Carlos
A2 - Appice, Annalisa
A2 - Appice, Annalisa
A2 - Rodrigues, Pedro Pereira
A2 - Costa, Vitor Santos
A2 - Soares, Carlos
A2 - Gama, João
A2 - Jorge, Alípio
A2 - Rodrigues, Pedro Pereira
A2 - Gama, João
A2 - Costa, Vitor Santos
A2 - Jorge, Alípio
A2 - Appice, Annalisa
A2 - Rodrigues, Pedro Pereira
A2 - Gama, João
A2 - Appice, Annalisa
A2 - Soares, Carlos
A2 - Jorge, Alípio
A2 - Gama, João
A2 - Rodrigues, Pedro Pereira
A2 - Costa, Vitor Santos
A2 - Soares, Carlos
A2 - Jorge, Alípio
PB - Springer
T2 - European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML PKDD 2015
Y2 - 7 September 2015 through 11 September 2015
ER -