Scalable link-based similarity computation and clustering

Xiaoxin Yin, Jiawei Han, Gabrielle Dawn Allen

Research output: Chapter in Book/Report/Conference proceeding › Chapter


Data objects in a relational database are cross-linked with each other via multi-typed links. Links contain rich semantic information that may indicate important relationships among objects, such as the similarities between objects. In this chapter, we explore linkage-based clustering, in which the similarity between two objects is measured based on the similarities between the objects linked with them. We study a hierarchical structure called SimTree, which represents similarities in a multi-granularity manner. This method avoids the high cost of computing and storing pairwise similarities but still thoroughly explores relationships among objects. We introduce an efficient algorithm for computing similarities utilizing the SimTree.
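
The recursive definition in the abstract (two objects are similar when the objects linked with them are similar) can be illustrated with a small sketch. This is not the chapter's SimTree algorithm, whose details are not given here. It is a naive SimRank-style iteration over all pairs, which is the quadratic pairwise computation the abstract says SimTree is designed to avoid. The function name `link_similarity`, the `decay` and `iterations` parameters, and the toy author/venue data are all assumptions made for illustration.

```python
from itertools import product


def link_similarity(neighbors, decay=0.8, iterations=5):
    """Estimate pairwise similarity from the similarity of linked objects.

    `neighbors` maps each object to the list of objects it links to.
    Every pair is stored explicitly, so memory grows quadratically.
    """
    nodes = list(neighbors)
    # Start from the identity: each object is similar only to itself.
    sim = {(a, b): 1.0 if a == b else 0.0 for a, b in product(nodes, repeat=2)}
    for _ in range(iterations):
        new_sim = {}
        for a, b in product(nodes, repeat=2):
            if a == b:
                new_sim[(a, b)] = 1.0
                continue
            na, nb = neighbors[a], neighbors[b]
            if not na or not nb:
                new_sim[(a, b)] = 0.0
                continue
            # Average similarity over all pairs of linked objects, damped by `decay`.
            total = sum(sim[(x, y)] for x in na for y in nb)
            new_sim[(a, b)] = decay * total / (len(na) * len(nb))
        sim = new_sim
    return sim


# Hypothetical toy data: authors linked to the venues they publish in.
neighbors = {
    "author1": ["venueA", "venueB"],
    "author2": ["venueA", "venueB"],
    "author3": ["venueC"],
    "venueA": ["author1", "author2"],
    "venueB": ["author1", "author2"],
    "venueC": ["author3"],
}
sim = link_similarity(neighbors)
```

In this toy data, `author1` and `author2` share both venues and so end up more similar to each other than either is to `author3`, who shares none. The dictionary holds one entry per ordered pair, which makes the storage cost of the pairwise approach visible even at this scale.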

Original language: English (US)
Title of host publication: Link Mining
Subtitle of host publication: Models, Algorithms, and Applications
Number of pages: 27
ISBN (Electronic): 9781441965158
ISBN (Print): 9781441965141
State: Published - 2010

ASJC Scopus subject areas

  • General Medicine

