Pull-based load distribution among heterogeneous parallel servers: the case of multiple routers

Research output: Contribution to journalArticle

Abstract

The model is a service system, consisting of several large server pools. A server’s processing speed and buffer size (which may be finite or infinite) depend on the pool. The input flow of customers is split equally among a fixed number of routers, which must assign customers to the servers immediately upon arrival. We consider an asymptotic regime in which the total customer arrival rate and pool sizes scale to infinity simultaneously, in proportion to a scaling parameter n, while the number of routers remains fixed. We define and study a multi-router generalization of the pull-based customer assignment (routing) algorithm PULL, introduced in Stolyar (Queueing Syst 80(4): 341–361, 2015) for the single-router model. Under the PULL algorithm, when a server becomes idle it sends a “pull-message” to a randomly uniformly selected router; each router operates independently—it assigns an arriving customer to a server according to a randomly uniformly chosen available (at this router) pull-message, if there is any, or to a randomly uniformly selected server in the entire system otherwise. Under Markov assumptions (Poisson arrival process and independent exponentially distributed service requirements), and under subcritical system load, we prove asymptotic optimality of PULL: as n→ ∞, the steady-state probability of an arriving customer experiencing blocking or waiting vanishes. Furthermore, PULL has an extremely low router–server message exchange rate of one message per customer. These results generalize some of the single-router results in Stolyar (2015).

Original languageEnglish (US)
Pages (from-to)31-65
Number of pages35
JournalQueueing Systems
Volume85
Issue number1-2
DOIs
StatePublished - Feb 1 2017

Fingerprint

Routers
Servers
Pull
Routing algorithms
Processing

Keywords

  • Asymptotic optimality
  • Fluid limits
  • Large-scale heterogeneous service systems
  • Load balancing
  • Multiple routers (dispatchers)
  • Pull-based load distribution

ASJC Scopus subject areas

  • Statistics and Probability
  • Computer Science Applications
  • Management Science and Operations Research
  • Computational Theory and Mathematics

Cite this

Pull-based load distribution among heterogeneous parallel servers : the case of multiple routers. / Stolyar, Alexander L.

In: Queueing Systems, Vol. 85, No. 1-2, 01.02.2017, p. 31-65.

Research output: Contribution to journalArticle

@article{2db47b47937d4f298c4099c8b5648ad7,
title = "Pull-based load distribution among heterogeneous parallel servers: the case of multiple routers",
abstract = "The model is a service system, consisting of several large server pools. A server’s processing speed and buffer size (which may be finite or infinite) depend on the pool. The input flow of customers is split equally among a fixed number of routers, which must assign customers to the servers immediately upon arrival. We consider an asymptotic regime in which the total customer arrival rate and pool sizes scale to infinity simultaneously, in proportion to a scaling parameter n, while the number of routers remains fixed. We define and study a multi-router generalization of the pull-based customer assignment (routing) algorithm PULL, introduced in Stolyar (Queueing Syst 80(4): 341–361, 2015) for the single-router model. Under the PULL algorithm, when a server becomes idle it sends a “pull-message” to a randomly uniformly selected router; each router operates independently—it assigns an arriving customer to a server according to a randomly uniformly chosen available (at this router) pull-message, if there is any, or to a randomly uniformly selected server in the entire system otherwise. Under Markov assumptions (Poisson arrival process and independent exponentially distributed service requirements), and under subcritical system load, we prove asymptotic optimality of PULL: as n→ ∞, the steady-state probability of an arriving customer experiencing blocking or waiting vanishes. Furthermore, PULL has an extremely low router–server message exchange rate of one message per customer. These results generalize some of the single-router results in Stolyar (2015).",
keywords = "Asymptotic optimality, Fluid limits, Large-scale heterogeneous service systems, Load balancing, Multiple routers (dispatchers), Pull-based load distribution",
author = "Stolyar, {Alexander L.}",
year = "2017",
month = "2",
day = "1",
doi = "10.1007/s11134-016-9508-8",
language = "English (US)",
volume = "85",
pages = "31--65",
journal = "Queueing Systems",
issn = "0257-0130",
publisher = "Springer Netherlands",
number = "1-2",

}

TY - JOUR

T1 - Pull-based load distribution among heterogeneous parallel servers

T2 - the case of multiple routers

AU - Stolyar, Alexander L.

PY - 2017/2/1

Y1 - 2017/2/1

N2 - The model is a service system, consisting of several large server pools. A server’s processing speed and buffer size (which may be finite or infinite) depend on the pool. The input flow of customers is split equally among a fixed number of routers, which must assign customers to the servers immediately upon arrival. We consider an asymptotic regime in which the total customer arrival rate and pool sizes scale to infinity simultaneously, in proportion to a scaling parameter n, while the number of routers remains fixed. We define and study a multi-router generalization of the pull-based customer assignment (routing) algorithm PULL, introduced in Stolyar (Queueing Syst 80(4): 341–361, 2015) for the single-router model. Under the PULL algorithm, when a server becomes idle it sends a “pull-message” to a randomly uniformly selected router; each router operates independently—it assigns an arriving customer to a server according to a randomly uniformly chosen available (at this router) pull-message, if there is any, or to a randomly uniformly selected server in the entire system otherwise. Under Markov assumptions (Poisson arrival process and independent exponentially distributed service requirements), and under subcritical system load, we prove asymptotic optimality of PULL: as n→ ∞, the steady-state probability of an arriving customer experiencing blocking or waiting vanishes. Furthermore, PULL has an extremely low router–server message exchange rate of one message per customer. These results generalize some of the single-router results in Stolyar (2015).

AB - The model is a service system, consisting of several large server pools. A server’s processing speed and buffer size (which may be finite or infinite) depend on the pool. The input flow of customers is split equally among a fixed number of routers, which must assign customers to the servers immediately upon arrival. We consider an asymptotic regime in which the total customer arrival rate and pool sizes scale to infinity simultaneously, in proportion to a scaling parameter n, while the number of routers remains fixed. We define and study a multi-router generalization of the pull-based customer assignment (routing) algorithm PULL, introduced in Stolyar (Queueing Syst 80(4): 341–361, 2015) for the single-router model. Under the PULL algorithm, when a server becomes idle it sends a “pull-message” to a randomly uniformly selected router; each router operates independently—it assigns an arriving customer to a server according to a randomly uniformly chosen available (at this router) pull-message, if there is any, or to a randomly uniformly selected server in the entire system otherwise. Under Markov assumptions (Poisson arrival process and independent exponentially distributed service requirements), and under subcritical system load, we prove asymptotic optimality of PULL: as n→ ∞, the steady-state probability of an arriving customer experiencing blocking or waiting vanishes. Furthermore, PULL has an extremely low router–server message exchange rate of one message per customer. These results generalize some of the single-router results in Stolyar (2015).

KW - Asymptotic optimality

KW - Fluid limits

KW - Large-scale heterogeneous service systems

KW - Load balancing

KW - Multiple routers (dispatchers)

KW - Pull-based load distribution

UR - http://www.scopus.com/inward/record.url?scp=84997287931&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84997287931&partnerID=8YFLogxK

U2 - 10.1007/s11134-016-9508-8

DO - 10.1007/s11134-016-9508-8

M3 - Article

AN - SCOPUS:84997287931

VL - 85

SP - 31

EP - 65

JO - Queueing Systems

JF - Queueing Systems

SN - 0257-0130

IS - 1-2

ER -