Scalable link-based similarity computation and clustering

Xiaoxin Yin, Jiawei Han, Philip S. Yu

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Abstract

Data objects in a relational database are cross-linked with each other via multi-typed links. Links contain rich semantic information that may indicate important relationships among objects, such as the similarities between objects. In this chapter we explore linkage-based clustering, in which the similarity between two objects is measured based on the similarities between the objects linked with them. We study a hierarchical structure called SimTree, which represents similarities in a multi-granularity manner. This method avoids the high cost of computing and storing pairwise similarities but still thoroughly explores relationships among objects. We introduce an efficient algorithm for computing similarities utilizing the SimTree.
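
The recursive definition in the abstract (two objects are similar when the objects linked to them are similar) can be illustrated with a minimal sketch. It assumes a SimRank-style averaging recurrence with a decay factor, which is a stand-in chosen for illustration and not the chapter's SimTree algorithm. The function name `link_similarity`, the `decay` and `iterations` parameters, and the toy author/paper data are all hypothetical.

```python
from itertools import product


def link_similarity(neighbors, decay=0.8, iterations=5):
    """Iteratively estimate pairwise link-based similarity.

    Assumption: a SimRank-style recurrence, used here only as an
    illustration of the general idea, not the SimTree method.
    `neighbors` maps each object to the list of objects it links to.
    """
    nodes = list(neighbors)
    # Start with each object similar only to itself.
    sim = {(a, b): 1.0 if a == b else 0.0 for a in nodes for b in nodes}
    for _ in range(iterations):
        new_sim = {}
        for a, b in product(nodes, repeat=2):
            if a == b:
                new_sim[(a, b)] = 1.0
                continue
            na, nb = neighbors[a], neighbors[b]
            if not na or not nb:
                new_sim[(a, b)] = 0.0
                continue
            # Average similarity over all pairs of linked objects.
            total = sum(sim[(i, j)] for i in na for j in nb)
            new_sim[(a, b)] = decay * total / (len(na) * len(nb))
        sim = new_sim
    return sim


# Hypothetical toy data: authors linked to the papers they wrote.
links = {
    "author1": ["paperA", "paperB"],
    "author2": ["paperA", "paperB"],
    "author3": ["paperC"],
    "paperA": ["author1", "author2"],
    "paperB": ["author1", "author2"],
    "paperC": ["author3"],
}
scores = link_similarity(links)
print(scores[("author1", "author2")] > scores[("author1", "author3")])
```

This sketch computes and stores a score for every pair of objects, which is the quadratic cost the abstract says the SimTree structure is meant to avoid.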

Original language: English (US)
Title of host publication: Link Mining
Subtitle of host publication: Models, Algorithms, and Applications
Publisher: Springer New York
Pages: 45-71
Number of pages: 27
Volume: 9781441965158
ISBN (Electronic): 9781441965158
ISBN (Print): 9781441965141
DOIs: https://doi.org/10.1007/978-1-4419-6515-8-2
State: Published - Jan 1 2010

Fingerprint

Cluster Analysis
Semantics
Databases
Costs and Cost Analysis
Object Attachment

ASJC Scopus subject areas

  • Medicine(all)

Cite this

Yin, X., Han, J., & Yu, P. S. (2010). Scalable link-based similarity computation and clustering. In Link Mining: Models, Algorithms, and Applications (Vol. 9781441965158, pp. 45-71). Springer New York. https://doi.org/10.1007/978-1-4419-6515-8-2

@inbook{491ae15d514041489a91cdb2d0f2dce6,
title = "Scalable link-based similarity computation and clustering",
abstract = "Data objects in a relational database are cross-linked with each other via multi-typed links. Links contain rich semantic information that may indicate important relationships among objects, such as the similarities between objects. In this chapter we explore linkage-based clustering, in which the similarity between two objects is measured based on the similarities between the objects linked with them. We study a hierarchical structure called SimTree, which represents similarities in a multi-granularity manner. This method avoids the high cost of computing and storing pairwise similarities but still thoroughly explores relationships among objects. We introduce an efficient algorithm for computing similarities utilizing the SimTree.",
author = "Xiaoxin Yin and Jiawei Han and Yu, {Philip S.}",
year = "2010",
month = "1",
day = "1",
doi = "10.1007/978-1-4419-6515-8-2",
language = "English (US)",
isbn = "9781441965141",
volume = "9781441965158",
pages = "45--71",
booktitle = "Link Mining",
publisher = "Springer New York",

}

TY - CHAP

T1 - Scalable link-based similarity computation and clustering

AU - Yin, Xiaoxin

AU - Han, Jiawei

AU - Yu, Philip S.

PY - 2010/1/1

Y1 - 2010/1/1

N2 - Data objects in a relational database are cross-linked with each other via multi-typed links. Links contain rich semantic information that may indicate important relationships among objects, such as the similarities between objects. In this chapter we explore linkage-based clustering, in which the similarity between two objects is measured based on the similarities between the objects linked with them. We study a hierarchical structure called SimTree, which represents similarities in a multi-granularity manner. This method avoids the high cost of computing and storing pairwise similarities but still thoroughly explores relationships among objects. We introduce an efficient algorithm for computing similarities utilizing the SimTree.

UR - http://www.scopus.com/inward/record.url?scp=84907577253&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84907577253&partnerID=8YFLogxK

U2 - 10.1007/978-1-4419-6515-8-2

DO - 10.1007/978-1-4419-6515-8-2

M3 - Chapter

AN - SCOPUS:84907577253

SN - 9781441965141

VL - 9781441965158

SP - 45

EP - 71

BT - Link Mining

PB - Springer New York

ER -