Scalable link-based similarity computation and clustering

Xiaoxin Yin, Jiawei Han, Philip S. Yu

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Abstract

Data objects in a relational database are cross-linked with each other via multi-typed links. Links contain rich semantic information that may indicate important relationships among objects, such as the similarities between objects. In this chapter we explore linkage-based clustering, in which the similarity between two objects is measured based on the similarities between the objects linked with them. We study a hierarchical structure called SimTree, which represents similarities in a multi-granularity manner. This method avoids the high cost of computing and storing pairwise similarities but still thoroughly explores relationships among objects. We introduce an efficient algorithm for computing similarities utilizing the SimTree.
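
As an illustration of the linkage-based idea only, the following is a minimal sketch of a SimRank-style iteration, in which two objects become similar when the objects linked with them are similar. It is not the chapter's SimTree algorithm: it computes and stores every pairwise similarity, which is exactly the cost SimTree is described as avoiding. The function name `link_based_similarity`, the `decay` and `iterations` parameters, and the toy author/venue data are all assumptions made for this example.

```python
from itertools import product

def link_based_similarity(links, decay=0.8, iterations=5):
    """SimRank-style sketch (an assumption, not the chapter's SimTree method).

    links maps each object to the list of objects it is linked with;
    every linked object must itself appear as a key.
    """
    nodes = sorted(links)
    # Start with each object similar only to itself.
    sim = {(a, b): 1.0 if a == b else 0.0 for a, b in product(nodes, repeat=2)}
    for _ in range(iterations):
        new_sim = {}
        for a, b in product(nodes, repeat=2):
            if a == b:
                new_sim[(a, b)] = 1.0
                continue
            na, nb = links[a], links[b]
            if not na or not nb:
                new_sim[(a, b)] = 0.0
                continue
            # Average similarity over all pairs of linked objects, damped by decay.
            total = sum(sim[(x, y)] for x in na for y in nb)
            new_sim[(a, b)] = decay * total / (len(na) * len(nb))
        sim = new_sim
    return sim

# Hypothetical toy data: authors linked to the venues they publish in.
links = {
    "author1": ["venueA"],
    "author2": ["venueA"],
    "author3": ["venueB"],
    "venueA": ["author1", "author2"],
    "venueB": ["author3"],
}
sim = link_based_similarity(links)
```

In this toy data, `author1` and `author2` share a venue and so receive a positive similarity, while `author3` shares no links with either and stays at zero. The dictionary holds one entry per ordered pair of objects, so its size grows quadratically with the number of objects.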

Original language: English (US)
Title of host publication: Link Mining
Subtitle of host publication: Models, Algorithms, and Applications
Publisher: Springer
Pages: 45-71
Number of pages: 27
Volume: 9781441965158
ISBN (Electronic): 9781441965158
ISBN (Print): 9781441965141
State: Published - 2010

ASJC Scopus subject areas

  • General Medicine
