TY - GEN
T1 - Mining connection pathways for marked nodes in large graphs
AU - Akoglu, Leman
AU - Vreeken, Jilles
AU - Tong, Hanghang
AU - Chau, Duen Horng
AU - Tatti, Nikolaj
AU - Faloutsos, Christos
N1 - Publisher Copyright:
Copyright © SIAM.
PY - 2013
Y1 - 2013
N2 - Suppose we are given a large graph in which, by some external process, a handful of nodes are marked. What can we say about these nodes? Are they close together in the graph? or, if segregated, how many groups do they form? We approach this problem by trying to find sets of simple connection pathways between sets of marked nodes. We formalize the problem in terms of the Minimum Description Length principle: a pathway is simple when we need only few bits to tell which edges to follow, such that we visit all nodes in a group. Then, the best partitioning is the one that requires the least number of bits to describe the paths that visit all the marked nodes. We prove that solving this problem is NP-hard, and introduce DOT2DOT, an efficient algorithm for partitioning marked nodes by finding simple pathways between nodes. Experimentation shows that DOT2DOT correctly groups nodes for which good connection paths can be constructed, while separating distant nodes.
AB - Suppose we are given a large graph in which, by some external process, a handful of nodes are marked. What can we say about these nodes? Are they close together in the graph? or, if segregated, how many groups do they form? We approach this problem by trying to find sets of simple connection pathways between sets of marked nodes. We formalize the problem in terms of the Minimum Description Length principle: a pathway is simple when we need only few bits to tell which edges to follow, such that we visit all nodes in a group. Then, the best partitioning is the one that requires the least number of bits to describe the paths that visit all the marked nodes. We prove that solving this problem is NP-hard, and introduce DOT2DOT, an efficient algorithm for partitioning marked nodes by finding simple pathways between nodes. Experimentation shows that DOT2DOT correctly groups nodes for which good connection paths can be constructed, while separating distant nodes.
UR - http://www.scopus.com/inward/record.url?scp=84960112617&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84960112617&partnerID=8YFLogxK
U2 - 10.1137/1.9781611972832.5
DO - 10.1137/1.9781611972832.5
M3 - Conference contribution
AN - SCOPUS:84960112617
T3 - Proceedings of the 2013 SIAM International Conference on Data Mining, SDM 2013
SP - 37
EP - 45
BT - Proceedings of the 2013 SIAM International Conference on Data Mining, SDM 2013
A2 - Ghosh, Joydeep
A2 - Obradovic, Zoran
A2 - Dy, Jennifer
A2 - Zhou, Zhi-Hua
A2 - Kamath, Chandrika
A2 - Parthasarathy, Srinivasan
PB - Siam Society under Royal Patronage
T2 - SIAM International Conference on Data Mining, SDM 2013
Y2 - 2 May 2013 through 4 May 2013
ER -