Bandits with budgets

Chong Jiang, R. Srikant

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Motivated by online advertising applications, we consider a version of the classical multi-armed bandit problem where there is a cost associated with pulling each arm, and a corresponding budget which limits the number of times that an arm can be pulled. We derive regret bounds on the expected reward in such a bandit problem using a modification of the well-known upper confidence bound algorithm UCB1.
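The abstract does not spell out how UCB1 is modified, so the following is only a minimal sketch of one plausible reading, not the authors' algorithm: standard UCB1 index selection, restricted at each round to arms whose remaining budget still covers the cost of one more pull. The function name `budgeted_ucb1`, its parameters, and the Bernoulli arms in the usage note are all hypothetical and chosen for illustration.

```python
import math
import random


def budgeted_ucb1(reward_fns, costs, budgets, horizon):
    """Illustrative UCB1 variant with a per-arm budget (an assumption,
    not the paper's exact method).

    reward_fns: list of zero-argument callables returning a reward in [0, 1]
    costs:      cost charged for each pull of arm i
    budgets:    total budget available to arm i
    horizon:    maximum number of rounds
    """
    n_arms = len(reward_fns)
    pulls = [0] * n_arms          # number of times each arm was pulled
    means = [0.0] * n_arms        # running mean reward of each arm
    remaining = list(budgets)     # budget left for each arm
    total_reward = 0.0

    for t in range(1, horizon + 1):
        # Only arms that can still pay for one more pull are eligible.
        affordable = [i for i in range(n_arms) if remaining[i] >= costs[i]]
        if not affordable:
            break  # every arm has exhausted its budget

        def index(i):
            # Standard UCB1 index; unpulled arms are tried first.
            if pulls[i] == 0:
                return math.inf
            return means[i] + math.sqrt(2.0 * math.log(t) / pulls[i])

        arm = max(affordable, key=index)
        reward = reward_fns[arm]()

        pulls[arm] += 1
        means[arm] += (reward - means[arm]) / pulls[arm]  # incremental mean
        remaining[arm] -= costs[arm]
        total_reward += reward

    return pulls, total_reward
```

As a usage illustration, calling `budgeted_ucb1` with two simulated Bernoulli arms, `costs=[1, 2]`, `budgets=[3, 10]`, and a long horizon stops once arm 0 has been pulled 3 times and arm 1 has been pulled 5 times, because neither arm can afford another pull.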

Original language: English (US)
Title of host publication: 2013 IEEE 52nd Annual Conference on Decision and Control, CDC 2013
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 5345-5350
Number of pages: 6
ISBN (Print): 9781467357173
State: Published - 2013
Event: 52nd IEEE Conference on Decision and Control, CDC 2013 - Florence, Italy
Duration: Dec 10 2013 to Dec 13 2013

Publication series

Name: Proceedings of the IEEE Conference on Decision and Control
ISSN (Print): 0743-1546
ISSN (Electronic): 2576-2370

Other

Other: 52nd IEEE Conference on Decision and Control, CDC 2013
Country/Territory: Italy
City: Florence
Period: 12/10/13 to 12/13/13

ASJC Scopus subject areas

  • Control and Systems Engineering
  • Modeling and Simulation
  • Control and Optimization
