Pull-based load distribution in large-scale heterogeneous service systems

Research output: Contribution to journalArticle

Abstract

The model is motivated by the problem of load distribution in large-scale cloud-based data processing systems. We consider a heterogeneous service system, consisting of multiple large server pools. The pools are different in that their servers may have different processing speeds and/or different buffer sizes (which may be finite or infinite). We study an asymptotic regime in which the customer arrival rate and pool sizes scale to infinity simultaneously, in proportion to some scaling parameter n. Arriving customers are assigned to the servers by a “router,” according to a pull-based algorithm, called PULL. Under the algorithm, each server sends a “pull-message” to the router, when it becomes idle; the router assigns an arriving customer to a server according to a randomly chosen available pull-message, if there are any, or to a random server, otherwise. Assuming subcritical system load, we prove asymptotic optimality of PULL. Namely, as system scale $$n\rightarrow \infty $$n→∞, the steady-state probability of an arriving customer experiencing blocking or waiting, vanishes. We also describe some generalizations of the model and PULL algorithm, for which the asymptotic optimality still holds.

Original languageEnglish (US)
Pages (from-to)341-361
Number of pages21
JournalQueueing Systems
Volume80
Issue number4
DOIs
StatePublished - Aug 23 2015
Externally publishedYes

    Fingerprint

Keywords

  • Asymptotic optimality
  • Fluid limits
  • Large-scale heterogeneous service systems
  • Load balancing
  • PULL algorithm
  • Pull-based load distribution
  • Stationary distribution

ASJC Scopus subject areas

  • Computer Science Applications
  • Management Science and Operations Research
  • Computational Theory and Mathematics

Cite this