TY - GEN
T1 - Latency-aware data partitioning for geo-replicated Online Social Networks
AU - Jiao, Lei
AU - Xu, Tianyin
AU - Li, Jun
AU - Fu, Xiaoming
N1 - Copyright:
Copyright 2012 Elsevier B.V., All rights reserved.
PY - 2011
Y1 - 2011
N2 - Large-scale Online Social Networks (OSNs) usually employ data replication across multiple datacenters in multiple geo-locations to ensure high availability and performance [1]. The de facto replication method in current OSNs (e.g., Facebook) is full replication, in which each geo-distributed datacenter maintains one copy of all the data. Full replication achieves good performance in a simple way but incurs high maintenance overhead (e.g., replica storage and synchronization). First, full replication leads to storage that grows linearly with the number of deployed datacenters, which scales poorly. Second, the data replicas across all locations require synchronization, resulting in large and expensive inter-datacenter WAN traffic. The ideal solution is to partition user data across multiple datacenters, making each geo-distributed datacenter maintain one partition of the whole data set. Unfortunately, partitioning OSN data with traditional graph algorithms is known to be very difficult due to the high interconnection and inter-dependency within the OSN data [2]. Moreover, geo-partitioning goes beyond traditional graph partitioning problems because user-perceived latency is a critical Quality-of-Service (QoS) issue to be considered.
AB - Large-scale Online Social Networks (OSNs) usually employ data replication across multiple datacenters in multiple geo-locations to ensure high availability and performance [1]. The de facto replication method in current OSNs (e.g., Facebook) is full replication, in which each geo-distributed datacenter maintains one copy of all the data. Full replication achieves good performance in a simple way but incurs high maintenance overhead (e.g., replica storage and synchronization). First, full replication leads to storage that grows linearly with the number of deployed datacenters, which scales poorly. Second, the data replicas across all locations require synchronization, resulting in large and expensive inter-datacenter WAN traffic. The ideal solution is to partition user data across multiple datacenters, making each geo-distributed datacenter maintain one partition of the whole data set. Unfortunately, partitioning OSN data with traditional graph algorithms is known to be very difficult due to the high interconnection and inter-dependency within the OSN data [2]. Moreover, geo-partitioning goes beyond traditional graph partitioning problems because user-perceived latency is a critical Quality-of-Service (QoS) issue to be considered.
UR - http://www.scopus.com/inward/record.url?scp=84862912977&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84862912977&partnerID=8YFLogxK
U2 - 10.1145/2088960.2088975
DO - 10.1145/2088960.2088975
M3 - Conference contribution
AN - SCOPUS:84862912977
SN - 9781450310734
T3 - Proceedings of the Workshop on Posters and Demos Track, PDT'11 - 12th International Middleware Conference, Middleware'11
BT - Proceedings of the Workshop on Posters and Demos Track, PDT'11 - 12th International Middleware Conference, Middleware'11
T2 - Workshop on Posters and Demos Track, PDT'11 - 12th International Middleware Conference, Middleware'11
Y2 - 12 December 2011 through 12 December 2011
ER -