Reward maximization in general dynamic matching systems

Mohammadreza Nazari, Alexander L. Stolyar

Research output: Contribution to journal (Article)

Abstract

We consider a matching system with random arrivals of items of different types. The items wait in queues—one per item type—until they are “matched.” Each matching requires certain quantities of items of different types; after a matching is activated, the associated items leave the system. There exists a finite set of possible matchings, each producing a certain amount of “reward.” This model has a broad range of important applications, including assemble-to-order systems, Internet advertising, and matching web portals. We propose an optimal matching scheme in the sense that it asymptotically maximizes the long-term average matching reward, while keeping the queues stable. The scheme makes matching decisions in a specially constructed virtual system, which in turn controls decisions in the physical system. The key feature of the virtual system is that, unlike the physical one, it allows the queues to become negative. The matchings in the virtual system are controlled by an extended version of the greedy primal–dual (GPD) algorithm, which we prove to be asymptotically optimal—this in turn implies the asymptotic optimality of the entire scheme. The scheme is real time; at any time, it uses simple rules based on the current state of the virtual and physical queues. It is very robust in that it does not require any knowledge of the item arrival rates and automatically adapts to changing rates. The extended GPD algorithm and its asymptotic optimality apply to a quite general queueing network framework, not limited to matching problems, and therefore are of independent interest.
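The virtual/physical split described in the abstract can be illustrated with a toy discrete-time simulation. This is only a sketch under stated assumptions, not the authors' EGPD algorithm: the abstract does not give the decision rule, so the score used here (reward plus `beta` times the queue-weighted item consumption), the one-decision-per-step timing, the FIFO handling of pending matchings, and all names (`simulate`, `beta`, `pending`) are hypothetical choices made for illustration. What it does show is the structural idea: decisions are made on virtual queues that may go negative, and the physical system carries them out once real items are available.

```python
import random
from collections import deque


def simulate(arrival_probs, matchings, rewards, beta=0.05, steps=2000, seed=0):
    """Toy virtual/physical matching simulation (illustrative assumptions only).

    arrival_probs[i]: probability that one item of type i arrives in a step.
    matchings[j][i]:  number of type-i items consumed by matching j.
    rewards[j]:       reward collected when matching j is physically executed.
    """
    rng = random.Random(seed)
    n = len(arrival_probs)
    virtual = [0] * n    # virtual queues: allowed to become negative
    physical = [0] * n   # physical queues: never negative
    pending = deque()    # matchings chosen virtually, waiting for real items
    total_reward = 0.0
    min_virtual = 0

    for _ in range(steps):
        # Arrivals are recorded in both the virtual and the physical system.
        for i, p in enumerate(arrival_probs):
            if rng.random() < p:
                virtual[i] += 1
                physical[i] += 1

        # Virtual decision (assumed rule): pick the matching with the largest
        # positive score; "do nothing" has score 0.
        best_j, best_score = None, 0.0
        for j, need in enumerate(matchings):
            score = rewards[j] + beta * sum(q * r for q, r in zip(virtual, need))
            if score > best_score:
                best_j, best_score = j, score
        if best_j is not None:
            for i, r in enumerate(matchings[best_j]):
                virtual[i] -= r          # may push a virtual queue below zero
            pending.append(best_j)
        min_virtual = min(min_virtual, min(virtual))

        # Physical system: execute pending matchings in FIFO order while the
        # head-of-line matching has all the items it needs.
        while pending and all(
            physical[i] >= r for i, r in enumerate(matchings[pending[0]])
        ):
            j = pending.popleft()
            for i, r in enumerate(matchings[j]):
                physical[i] -= r
            total_reward += rewards[j]

    return {
        "total_reward": total_reward,
        "physical": physical,
        "virtual": virtual,
        "pending": list(pending),
        "min_virtual": min_virtual,
    }


if __name__ == "__main__":
    # Three item types, two matchings: (type 0 + type 1) and (type 1 + type 2).
    result = simulate(
        arrival_probs=[0.5, 0.5, 0.3],
        matchings=[(1, 1, 0), (0, 1, 1)],
        rewards=[1.0, 3.0],
    )
    print(result)
```

In this sketch each physical queue always equals the corresponding virtual queue plus the items already committed to pending matchings, which is why the virtual queues can be negative while the physical ones cannot. The rule uses only the current queue state and never the arrival probabilities, in line with the robustness property the abstract describes.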

Original language: English (US)
Pages (from-to): 143-170
Number of pages: 28
Journal: Queueing Systems
Volume: 91
Issue number: 1-2
DOI: 10.1007/s11134-018-9593-y
State: Published - Feb 15 2019

Fingerprint

Queueing networks
Marketing
Internet
Reward
Queue
Greedy algorithm
Asymptotic optimality

Keywords

  • Dynamic matching
  • EGPD algorithm
  • Optimal control
  • Stability
  • Utility maximization
  • Virtual queues

ASJC Scopus subject areas

  • Computer Science Applications
  • Management Science and Operations Research
  • Computational Theory and Mathematics

Cite this

Reward maximization in general dynamic matching systems. / Nazari, Mohammadreza; Stolyar, Alexander L.

In: Queueing Systems, Vol. 91, No. 1-2, 15.02.2019, p. 143-170.

@article{1c6c519f3ded4b1d8919eacb1c70ea8a,
title = "Reward maximization in general dynamic matching systems",
abstract = "We consider a matching system with random arrivals of items of different types. The items wait in queues—one per item type—until they are “matched.” Each matching requires certain quantities of items of different types; after a matching is activated, the associated items leave the system. There exists a finite set of possible matchings, each producing a certain amount of “reward.” This model has a broad range of important applications, including assemble-to-order systems, Internet advertising, and matching web portals. We propose an optimal matching scheme in the sense that it asymptotically maximizes the long-term average matching reward, while keeping the queues stable. The scheme makes matching decisions in a specially constructed virtual system, which in turn controls decisions in the physical system. The key feature of the virtual system is that, unlike the physical one, it allows the queues to become negative. The matchings in the virtual system are controlled by an extended version of the greedy primal–dual (GPD) algorithm, which we prove to be asymptotically optimal—this in turn implies the asymptotic optimality of the entire scheme. The scheme is real time; at any time, it uses simple rules based on the current state of the virtual and physical queues. It is very robust in that it does not require any knowledge of the item arrival rates and automatically adapts to changing rates. The extended GPD algorithm and its asymptotic optimality apply to a quite general queueing network framework, not limited to matching problems, and therefore are of independent interest.",
keywords = "Dynamic matching, EGPD algorithm, Optimal control, Stability, Utility maximization, Virtual queues",
author = "Nazari, Mohammadreza and Stolyar, {Alexander L.}",
year = "2019",
month = "2",
day = "15",
doi = "10.1007/s11134-018-9593-y",
language = "English (US)",
volume = "91",
pages = "143--170",
journal = "Queueing Systems",
issn = "0257-0130",
publisher = "Springer Netherlands",
number = "1-2",
}

TY - JOUR

T1 - Reward maximization in general dynamic matching systems

AU - Nazari, Mohammadreza

AU - Stolyar, Alexander L.

PY - 2019/2/15

Y1 - 2019/2/15

N2 - We consider a matching system with random arrivals of items of different types. The items wait in queues—one per item type—until they are “matched.” Each matching requires certain quantities of items of different types; after a matching is activated, the associated items leave the system. There exists a finite set of possible matchings, each producing a certain amount of “reward.” This model has a broad range of important applications, including assemble-to-order systems, Internet advertising, and matching web portals. We propose an optimal matching scheme in the sense that it asymptotically maximizes the long-term average matching reward, while keeping the queues stable. The scheme makes matching decisions in a specially constructed virtual system, which in turn controls decisions in the physical system. The key feature of the virtual system is that, unlike the physical one, it allows the queues to become negative. The matchings in the virtual system are controlled by an extended version of the greedy primal–dual (GPD) algorithm, which we prove to be asymptotically optimal—this in turn implies the asymptotic optimality of the entire scheme. The scheme is real time; at any time, it uses simple rules based on the current state of the virtual and physical queues. It is very robust in that it does not require any knowledge of the item arrival rates and automatically adapts to changing rates. The extended GPD algorithm and its asymptotic optimality apply to a quite general queueing network framework, not limited to matching problems, and therefore are of independent interest.

KW - Dynamic matching

KW - EGPD algorithm

KW - Optimal control

KW - Stability

KW - Utility maximization

KW - Virtual queues

UR - http://www.scopus.com/inward/record.url?scp=85056461924&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85056461924&partnerID=8YFLogxK

U2 - 10.1007/s11134-018-9593-y

DO - 10.1007/s11134-018-9593-y

M3 - Article

AN - SCOPUS:85056461924

VL - 91

SP - 143

EP - 170

JO - Queueing Systems

JF - Queueing Systems

SN - 0257-0130

IS - 1-2

ER -