Partitioning social networks for fast retrieval of time-dependent queries

Mindi Yuan, David Stein, Berenice Carrasco, Joana M.F. Trindade, Yi Lu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Online social network (OSN) queries require retrievals of multiple small records generated by different users in the network, and the set of records to be retrieved is time dependent. The current implementation of hash-based partitioning results in accesses at a large number of servers, which significantly degrades response time. Partitioning the OSN friendship graph is difficult as its power-law degree distribution leads to many cross-partition edges. Naive replication requires extra storage that is orders of magnitude larger. In our previous work (2011), we proposed to partition not only the spatial network of social relations, but also the time dimension, so that users who have communicated in a given period are grouped together. We built an activity prediction graph (APG) to keep in one partition newly created data that are highly likely to be accessed together. In this paper, we analyze the distribution of the Facebook wall posts in the New Orleans network. We further emphasize that the objective of partitioning is to keep the two-hop neighborhood of a user in one partition, instead of the one-hop network usually considered. Two-hop neighborhoods are the basic units of retrieval in OSNs and can be much larger than one-hop networks. We use a static partitioning method based on KMETIS, and a dynamic local partitioning method that maintains evenness and requires only a small amount of data movement across partitions. For evaluation, the partitioning results are tested with emulation of Facebook page downloads. We show that partitioning on two-hop networks yields at least 19% more local queries than its one-hop counterpart. The static algorithm achieves 5.6 times better data locality than hash-based partitioning, and the dynamic algorithm achieves 6.4 times better locality while keeping the number of movements small. Almost all queries are kept in at most 3 partitions for both algorithms.
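
The locality objective described in the abstract can be illustrated with a toy example. The Python sketch below is not the paper's method: it does not implement KMETIS or the activity prediction graph, and the graph, the user ids, the modulo "hash" and the hand-made community assignment are all assumptions chosen for illustration. It only shows how one might count "local" queries, taking a query to be local when a user's whole two-hop neighborhood sits in one partition.

```python
from collections import defaultdict


def build_graph(edges):
    """Build an undirected adjacency map from (u, v) friendship pairs."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def two_hop_neighborhood(adj, user):
    """Return the user, their friends, and their friends' friends."""
    one_hop = set(adj[user])
    result = {user} | one_hop
    for friend in one_hop:
        result |= adj[friend]
    return result


def partitions_touched(adj, user, assignment):
    """Set of partitions that hold any record in the user's two-hop neighborhood."""
    return {assignment[u] for u in two_hop_neighborhood(adj, user)}


def count_local_queries(adj, assignment):
    """Number of users whose two-hop neighborhood fits in a single partition."""
    return sum(1 for u in adj if len(partitions_touched(adj, u, assignment)) == 1)


# Hypothetical toy network: two triangles joined by a two-node bridge (6, 7).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 6), (6, 7), (7, 3)]
graph = build_graph(edges)

# Stand-in for hash-based partitioning: user id modulo the partition count.
num_partitions = 2
hash_assignment = {u: u % num_partitions for u in graph}

# Hand-made community assignment, standing in for a graph partitioner's output.
community_assignment = {0: 0, 1: 0, 2: 0, 6: 0, 3: 1, 4: 1, 5: 1, 7: 1}

if __name__ == "__main__":
    print("local queries, hash-based:     ", count_local_queries(graph, hash_assignment))
    print("local queries, community-based:", count_local_queries(graph, community_assignment))
```

On this toy graph the community-aware assignment keeps some two-hop neighborhoods in one partition while the modulo assignment keeps none; the numbers say nothing about the results reported in the paper.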

Original language: English (US)
Title of host publication: Proceedings - 2012 IEEE 28th International Conference on Data Engineering Workshops, ICDEW 2012
Pages: 205-212
Number of pages: 8
State: Published - Nov 19 2012
Event: 2012 IEEE 28th International Conference on Data Engineering Workshops, ICDEW 2012 - Arlington, VA, United States
Duration: Apr 1 2012 → Apr 5 2012

Publication series

Name: Proceedings - 2012 IEEE 28th International Conference on Data Engineering Workshops, ICDEW 2012

Other

Other: 2012 IEEE 28th International Conference on Data Engineering Workshops, ICDEW 2012
Country: United States
City: Arlington, VA
Period: 4/1/12 → 4/5/12

ASJC Scopus subject areas

  • Software
