On Link-based similarity join

Liwen Sun, Reynold Cheng, Xiang Li, David W. Cheung, Jiawei Han

Research output: Contribution to journalConference articlepeer-review

Abstract

Graphs can be found in applications like social networks, bibliographic networks, and biological databases. Understanding the relationship, or links, among graph nodes enables applications such as link prediction, recommendation, and spam detection. In this paper, we propose link-based similarity join (LS-join), which extends the similarity join operator to link-based measures. Given two sets of nodes in a graph, the LS-join returns all pairs of nodes that are highly similar to each other, with respect to an e-function. The e-function generalizes common measures like Personalized PageRank (PPR) and SimRank (SR). We study an efficient LS-join algorithm on a large graph. We further improve our solutions for PPR and SR, which involve expensive randomwalk operations. We validate our solutions by performing extensive experiments on three real graph datasets.

Original languageEnglish (US)
Pages (from-to)714-725
Number of pages12
JournalProceedings of the VLDB Endowment
Volume4
Issue number11
DOIs
StatePublished - Aug 2011
Event37th International Conference on Very Large Data Bases, VLDB 2011 - Seattle, United States
Duration: Aug 29 2011Sep 3 2011

ASJC Scopus subject areas

  • Computer Science (miscellaneous)
  • General Computer Science

Fingerprint

Dive into the research topics of 'On Link-based similarity join'. Together they form a unique fingerprint.

Cite this