Resource Management: Performance Assuredness in Distributed Cloud Computing via Online Reconfigurations

Mainak Ghosh, Le Xu, Indranil Gupta

Research output: Chapter in Book/Report/Conference proceeding (Chapter)


Cloud computing relies on software for distributed batch and stream processing, as well as distributed storage. This chapter focuses on an oft-ignored angle of assuredness: performance assuredness. A significant pain point today is the inability to support reconfiguration operations, such as changing the shard key in a sharded storage/database system, or scaling up (or down) the number of virtual machines (VMs) used in a stream or batch processing system. We discuss new techniques to support such reconfiguration operations in an online manner, whereby the system does not need to be shut down and the user/client-perceived behavior is indistinguishable regardless of whether a reconfiguration is occurring in the background; that is, performance continues to be assured in spite of ongoing background reconfiguration. Next, we describe how to scale out and scale in (increase or decrease) the number of machines/VMs in cloud computing frameworks such as distributed stream processing and distributed graph processing systems, again while offering assured performance to the customer in spite of the reconfigurations occurring in the background. The ultimate performance assuredness is the ability to support SLAs/SLOs (service-level agreements/objectives) such as deadlines. We present a new real-time scheduler that supports priorities and hard deadlines for Hadoop jobs. We implemented our reconfiguration systems as patches to several popular open-source cloud computing systems, including MongoDB and Cassandra (storage), Storm (stream processing), LFGraph (graph processing), and Hadoop (batch processing).
Original language: English (US)
Title of host publication: Assured Cloud Computing
Editors: Roy H. Campbell, Charles A. Kamhoua, Kevin A. Kwiat
Publisher: Wiley-IEEE Press
ISBN (Electronic): 9781119428497
ISBN (Print): 9781119428633
State: Published - Dec 20 2018


  • batch processing systems
  • scale-out operations
  • scale-in operations
  • resource management
  • performance assuredness
  • online reconfigurations
  • NoSQL database
  • key-value database
  • distributed cloud computing

