TY - GEN
T1 - Bobtail
T2 - 10th USENIX Symposium on Networked Systems Design and Implementation, NSDI 2013
AU - Xu, Yunjing
AU - Musgrave, Zachary
AU - Noble, Brian
AU - Bailey, Michael
N1 - Publisher Copyright: © Proceedings of the 10th USENIX Symposium on Networked Systems Design and Implementation, NSDI 2013. All rights reserved.
PY - 2013
Y1 - 2013
AB - Highly modular data center applications such as Bing, Facebook, and Amazon's retail platform are known to be susceptible to long tails in response times. Services such as Amazon's EC2 have proven attractive platforms for building similar applications. Unfortunately, virtualization used in such platforms exacerbates the long tail problem by factors of two to four. Surprisingly, we find that poor response times in EC2 are a property of nodes rather than the network, and that this property of nodes is both pervasive throughout EC2 and persistent over time. The root cause of this problem is co-scheduling of CPU-bound and latency-sensitive tasks. We leverage these observations in Bobtail, a system that proactively detects and avoids these bad neighboring VMs without significantly penalizing node instantiation. With Bobtail, common communication patterns benefit from reductions of up to 40% in 99.9th percentile response times.
UR - http://www.scopus.com/inward/record.url?scp=85076715564&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85076715564&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85076715564
T3 - Proceedings of the 10th USENIX Symposium on Networked Systems Design and Implementation, NSDI 2013
SP - 329
EP - 341
BT - Proceedings of the 10th USENIX Symposium on Networked Systems Design and Implementation, NSDI 2013
PB - USENIX Association
Y2 - 2 April 2013 through 5 April 2013
ER -