Mining connection pathways for marked nodes in large graphs

Leman Akoglu, Jilles Vreeken, Hanghang Tong, Duen Horng Chau, Nikolaj Tatti, Christos Faloutsos

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Suppose we are given a large graph in which, by some external process, a handful of nodes are marked. What can we say about these nodes? Are they close together in the graph? Or, if segregated, how many groups do they form? We approach this problem by trying to find sets of simple connection pathways between sets of marked nodes. We formalize the problem in terms of the Minimum Description Length principle: a pathway is simple when we need only a few bits to tell which edges to follow, such that we visit all nodes in a group. Then, the best partitioning is the one that requires the fewest bits to describe the paths that visit all the marked nodes. We prove that solving this problem is NP-hard, and introduce DOT2DOT, an efficient algorithm for partitioning marked nodes by finding simple pathways between nodes. Experimentation shows that DOT2DOT correctly groups nodes for which good connection paths can be constructed, while separating distant nodes.
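The description-length idea in the abstract can be illustrated with a small sketch. This is not DOT2DOT and not the paper's exact encoding. The cost model here is an assumption made for illustration: log2(number of nodes) bits to name a path's start node, plus log2(degree) bits at each step to say which edge to follow. The names `path_bits` and `partition_bits` are hypothetical.

```python
import math

def path_bits(graph, path):
    """Bits to describe one path under the assumed cost model."""
    # Assumption: naming the start node costs log2(|V|) bits.
    bits = math.log2(len(graph))
    for u, v in zip(path, path[1:]):
        if v not in graph[u]:
            raise ValueError(f"{u}-{v} is not an edge")
        # Assumption: choosing one of deg(u) outgoing edges costs log2(deg(u)) bits.
        bits += math.log2(len(graph[u]))
    return bits

def partition_bits(graph, paths):
    """Total bits for a partitioning, given one path per group of marked nodes."""
    return sum(path_bits(graph, p) for p in paths)

# Toy graph: a simple chain a-b-c-d-e-f-g-h; marked nodes are a, c and h.
chain = "abcdefgh"
graph = {n: set() for n in chain}
for u, v in zip(chain, chain[1:]):
    graph[u].add(v)
    graph[v].add(u)

one_group = partition_bits(graph, [list(chain)])              # {a, c, h} together
two_groups = partition_bits(graph, [["a", "b", "c"], ["h"]])  # {a, c} and {h}
print(one_group, two_groups)
```

Under these assumptions, describing a and c with one short path and h on its own takes fewer bits than one long path through all three marked nodes, which mirrors the abstract's point that distant nodes end up in separate groups.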

Original language: English (US)
Title of host publication: Proceedings of the 2013 SIAM International Conference on Data Mining, SDM 2013
Editors: Joydeep Ghosh, Zoran Obradovic, Jennifer Dy, Zhi-Hua Zhou, Chandrika Kamath, Srinivasan Parthasarathy
Publisher: Society for Industrial and Applied Mathematics (SIAM)
Pages: 37-45
Number of pages: 9
ISBN (Electronic): 9781611972627
State: Published - 2013
Externally published: Yes
Event: SIAM International Conference on Data Mining, SDM 2013 - Austin, United States
Duration: May 2 2013 to May 4 2013

Publication series

Name: Proceedings of the 2013 SIAM International Conference on Data Mining, SDM 2013

Other

Other: SIAM International Conference on Data Mining, SDM 2013
Country/Territory: United States
City: Austin
Period: 5/2/13 to 5/4/13

ASJC Scopus subject areas

  • Computer Science Applications
  • Software
  • Theoretical Computer Science
  • Information Systems
  • Signal Processing
