TY - GEN
T1 - Center-piece subgraphs
T2 - KDD 2006: 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
AU - Tong, Hanghang
AU - Faloutsos, Christos
PY - 2006
Y1 - 2006
N2 - Given Q nodes in a social network (say, authorship network), how can we find the node/author that is the center-piece, and has direct or indirect connections to all, or most of them? For example, this node could be the common advisor, or someone who started the research area that the Q nodes belong to. Isomorphic scenarios appear in law enforcement (find the master-mind criminal, connected to all current suspects), gene regulatory networks (find the protein that participates in pathways with all or most of the given Q proteins), viral marketing and many more. Connection subgraphs is an important first step, handling the case of Q=2 query nodes. Then, the connection subgraph algorithm finds the b intermediate nodes, that provide a good connection between the two original query nodes. Here we generalize the challenge in multiple dimensions: First, we allow more than two query nodes. Second, we allow a whole family of queries, ranging from 'OR' to 'AND', with 'softAND' in-between. Finally, we design and compare a fast approximation, and study the quality/speed trade-off. We also present experiments on the DBLP dataset. The experiments confirm that our proposed method naturally deals with multi-source queries and that the resulting subgraphs agree with our intuition. Wall-clock timing results on the DBLP dataset show that our proposed approximation achieve good accuracy for about 6: 1 speedup.
AB - Given Q nodes in a social network (say, authorship network), how can we find the node/author that is the center-piece, and has direct or indirect connections to all, or most of them? For example, this node could be the common advisor, or someone who started the research area that the Q nodes belong to. Isomorphic scenarios appear in law enforcement (find the master-mind criminal, connected to all current suspects), gene regulatory networks (find the protein that participates in pathways with all or most of the given Q proteins), viral marketing and many more. Connection subgraphs is an important first step, handling the case of Q=2 query nodes. Then, the connection subgraph algorithm finds the b intermediate nodes, that provide a good connection between the two original query nodes. Here we generalize the challenge in multiple dimensions: First, we allow more than two query nodes. Second, we allow a whole family of queries, ranging from 'OR' to 'AND', with 'softAND' in-between. Finally, we design and compare a fast approximation, and study the quality/speed trade-off. We also present experiments on the DBLP dataset. The experiments confirm that our proposed method naturally deals with multi-source queries and that the resulting subgraphs agree with our intuition. Wall-clock timing results on the DBLP dataset show that our proposed approximation achieve good accuracy for about 6: 1 speedup.
KW - Center-piece subgraph
KW - Goodness score
KW - K_softAND
UR - http://www.scopus.com/inward/record.url?scp=33749577591&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=33749577591&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:33749577591
SN - 1595933395
SN - 9781595933393
T3 - Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
SP - 404
EP - 413
BT - KDD 2006
Y2 - 20 August 2006 through 23 August 2006
ER -