TY - GEN
T1 - Partitioning social networks for fast retrieval of time-dependent queries
AU - Yuan, Mindi
AU - Stein, David
AU - Carrasco, Berenice
AU - Trindade, Joana M.F.
AU - Lu, Yi
PY - 2012
Y1 - 2012
N2 - Online social network (OSN) queries require retrievals of multiple small records generated by different users in the network, and the set of records to be retrieved is time dependent. Current implementation of hash-based partitioning results in accesses at a large number of servers, which significantly degrades response time. Partitioning the OSN friendship graph is difficult as its power-law degree distribution leads to many cross-partition edges. Naive replication requires extra storage that is orders of magnitude larger. In our previous work (2011), we proposed to partition not only the spatial network of social relations, but also in the time dimension so that users who have communicated in a given period are grouped together. We built an activity prediction graph (APG) to keep in one partition newly created data that are highly likely to be accessed together. In this paper, we analyze the distribution of the Facebook wall posts in the New Orleans network. We further emphasize that the objective of partitioning is to keep the two-hop neighborhood of a user in one partition, instead of the one-hop network usually considered. Two-hop neighborhoods are the basic units of retrieval in OSN and can be much larger than one-hop networks. We use a static partitioning method based on KMETIS, and a dynamic local partitioning method that maintains evenness and requires only a small amount of data movement across partitions. For evaluation, the partitioning results are tested with emulation of Facebook page downloads. We show that partitioning on two-hop networks yields at least 19% more local queries than its one-hop counterpart. The static algorithm achieves 5.6 times better data locality than hash-based partitioning and the dynamic algorithm achieves 6.4 times better locality while keeping the number of movements small. Almost all queries are kept in at most 3 partitions for both algorithms.
AB - Online social network (OSN) queries require retrievals of multiple small records generated by different users in the network, and the set of records to be retrieved is time dependent. Current implementation of hash-based partitioning results in accesses at a large number of servers, which significantly degrades response time. Partitioning the OSN friendship graph is difficult as its power-law degree distribution leads to many cross-partition edges. Naive replication requires extra storage that is orders of magnitude larger. In our previous work (2011), we proposed to partition not only the spatial network of social relations, but also in the time dimension so that users who have communicated in a given period are grouped together. We built an activity prediction graph (APG) to keep in one partition newly created data that are highly likely to be accessed together. In this paper, we analyze the distribution of the Facebook wall posts in the New Orleans network. We further emphasize that the objective of partitioning is to keep the two-hop neighborhood of a user in one partition, instead of the one-hop network usually considered. Two-hop neighborhoods are the basic units of retrieval in OSN and can be much larger than one-hop networks. We use a static partitioning method based on KMETIS, and a dynamic local partitioning method that maintains evenness and requires only a small amount of data movement across partitions. For evaluation, the partitioning results are tested with emulation of Facebook page downloads. We show that partitioning on two-hop networks yields at least 19% more local queries than its one-hop counterpart. The static algorithm achieves 5.6 times better data locality than hash-based partitioning and the dynamic algorithm achieves 6.4 times better locality while keeping the number of movements small. Almost all queries are kept in at most 3 partitions for both algorithms.
UR - http://www.scopus.com/inward/record.url?scp=84869053571&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84869053571&partnerID=8YFLogxK
U2 - 10.1109/ICDEW.2012.63
DO - 10.1109/ICDEW.2012.63
M3 - Conference contribution
AN - SCOPUS:84869053571
SN - 9780769547480
T3 - Proceedings - 2012 IEEE 28th International Conference on Data Engineering Workshops, ICDEW 2012
SP - 205
EP - 212
BT - Proceedings - 2012 IEEE 28th International Conference on Data Engineering Workshops, ICDEW 2012
T2 - 2012 IEEE 28th International Conference on Data Engineering Workshops, ICDEW 2012
Y2 - 1 April 2012 through 5 April 2012
ER -