Content-based scheduling of virtual machines (VMs) in the cloud

Sobir Bazarbayev, Matti Hiltunen, Kaustubh Joshi, William H. Sanders, Richard Schlichting

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Organizations of all sizes are shifting their IT infrastructures to the cloud because of its cost efficiency and convenience. Because of the on-demand nature of Infrastructure as a Service (IaaS) clouds, hundreds of thousands of virtual machines (VMs) may be deployed and terminated in a single large cloud data center each day. In this paper, we propose a content-based scheduling algorithm for the placement of VMs in data centers. Disk images of VMs with similar operating systems can contain identical disk blocks; we take advantage of this by scheduling VMs with high content similarity on the same hosts, which reduces the amount of data transferred when deploying a VM on a destination host. We first present our study of content similarity between different VMs, based on a large set of VMs with different operating systems that represent the majority of popular operating systems in use today. Our analysis shows that content similarity between VMs with the same operating system and close version numbers (e.g., Ubuntu 12.04 vs. Ubuntu 11.10) can be as high as 60%. We also show that there is close to zero content similarity between VMs with different operating systems. Second, based on these results, we designed a content-based scheduling algorithm that lowers the network traffic associated with the transfer of VM disk images inside data centers. Our experimental results show that the amount of data transfer associated with deployment of VMs and transfer of virtual disk images can be lowered by more than 70%, resulting in significant savings in data center network utilization and congestion.
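
The placement idea in the abstract can be illustrated with a small sketch. This is not the authors' algorithm, only a minimal approximation under stated assumptions: disk images are split into fixed-size blocks, blocks are compared by SHA-1 digest, and a new VM goes to the host that already stores the largest share of its blocks. The block size and the names `block_hashes`, `similarity`, and `pick_host` are hypothetical and do not come from the paper.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size in bytes; the abstract does not specify one


def block_hashes(image: bytes, block_size: int = BLOCK_SIZE) -> set[str]:
    """Return the set of SHA-1 digests of the fixed-size blocks of a disk image."""
    return {
        hashlib.sha1(image[i:i + block_size]).hexdigest()
        for i in range(0, len(image), block_size)
    }


def similarity(new_vm: set[str], host_blocks: set[str]) -> float:
    """Fraction of the new VM's distinct blocks that the host already stores."""
    if not new_vm:
        return 0.0
    return len(new_vm & host_blocks) / len(new_vm)


def pick_host(new_vm: set[str], hosts: dict[str, set[str]]) -> str:
    """Choose the host that already holds the largest share of the VM's blocks."""
    return max(hosts, key=lambda name: similarity(new_vm, hosts[name]))


# Toy "disk images" with four 4-byte blocks each (illustrative data only).
image_a = b"AAAABBBBCCCCDDDD"    # already stored on host-a
image_b = b"WWWWXXXXYYYYZZZZ"    # already stored on host-b, unrelated content
image_new = b"AAAABBBBCCCCEEEE"  # VM to deploy; shares three blocks with image_a

hosts = {
    "host-a": block_hashes(image_a, 4),
    "host-b": block_hashes(image_b, 4),
}
new_vm = block_hashes(image_new, 4)

chosen = pick_host(new_vm, hosts)
print(chosen, similarity(new_vm, hosts[chosen]))
```

In this toy case only the one block missing from `host-a` would need to be transferred. A real scheduler would also have to weigh host capacity and load, which this sketch ignores.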

Original language: English (US)
Title of host publication: Proceedings - 2013 IEEE 33rd International Conference on Distributed Computing Systems, ICDCS 2013
Pages: 93-101
Number of pages: 9
DOIs: https://doi.org/10.1109/ICDCS.2013.15
State: Published - Dec 1 2013
Event: 2013 IEEE 33rd International Conference on Distributed Computing Systems, ICDCS 2013 - Philadelphia, PA, United States
Duration: Jul 8 2013 to Jul 11 2013

Publication series

Name: Proceedings - International Conference on Distributed Computing Systems

Other

Other: 2013 IEEE 33rd International Conference on Distributed Computing Systems, ICDCS 2013
Country: United States
City: Philadelphia, PA
Period: 7/8/13 to 7/11/13

Keywords

  • Cloud-computing
  • Data center
  • Scheduling
  • Virtualization

ASJC Scopus subject areas

  • Software
  • Hardware and Architecture
  • Computer Networks and Communications

Cite this

Bazarbayev, S., Hiltunen, M., Joshi, K., Sanders, W. H., & Schlichting, R. (2013). Content-based scheduling of virtual machines (VMs) in the cloud. In Proceedings - 2013 IEEE 33rd International Conference on Distributed Computing Systems, ICDCS 2013 (pp. 93-101). [6681579] (Proceedings - International Conference on Distributed Computing Systems). https://doi.org/10.1109/ICDCS.2013.15