Finding a most biased coin with fewest flips

Research output: Contribution to journal › Conference article › peer-review


We study the problem of learning a most biased coin among a set of coins by tossing the coins adaptively. The goal is to minimize the number of tosses until we identify a coin whose posterior probability of being most biased is at least 1 - δ for a given δ. Under a particular probabilistic model, we give an optimal algorithm, i.e., an algorithm that minimizes the expected number of future tosses. The problem is closely related to finding the best arm in the multi-armed bandit problem using adaptive strategies. Our algorithm employs an optimal adaptive strategy, that is, a strategy that performs the best possible action at each step after observing the outcomes of all previous coin tosses. Consequently, our algorithm is also optimal for any given starting history of outcomes. To our knowledge, this is the first algorithm that employs an optimal adaptive strategy under a Bayesian setting for this problem. Our proof of optimality employs mathematical tools from the area of Markov games.
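The stopping criterion described in the abstract (stop once some coin's posterior reaches 1 - δ) can be illustrated with a minimal sketch. The abstract does not specify the probabilistic model, so the code below assumes a simple two-type prior: each coin independently has heads probability `p_hi` with prior probability `alpha`, and `p_lo` otherwise. The names `log_odds_high`, `find_biased_coin`, `alpha`, `p_hi`, `p_lo`, and `max_tosses` are all hypothetical, and the greedy rule "toss the coin with the highest posterior" is only an illustrative heuristic, not a statement of the paper's optimal strategy.

```python
import math
import random


def log_odds_high(heads, tails, alpha, p_hi, p_lo):
    """Posterior log-odds that a coin is the high-bias type (assumed two-type prior)."""
    return (math.log(alpha / (1 - alpha))
            + heads * math.log(p_hi / p_lo)
            + tails * math.log((1 - p_hi) / (1 - p_lo)))


def find_biased_coin(biases, alpha, p_hi, p_lo, delta, rng, max_tosses=100_000):
    """Greedy illustration: keep tossing the coin with the highest posterior.

    Stops when some coin's posterior of being high-bias is at least 1 - delta,
    which is equivalent to its posterior odds reaching (1 - delta) / delta.
    Returns (coin index or None, number of tosses used).
    """
    threshold = math.log((1 - delta) / delta)
    counts = [[0, 0] for _ in biases]  # [heads, tails] per coin
    for tosses in range(max_tosses):
        scores = [log_odds_high(h, t, alpha, p_hi, p_lo) for h, t in counts]
        best = max(range(len(biases)), key=scores.__getitem__)
        if scores[best] >= threshold:
            return best, tosses
        # Simulate one toss of the currently most promising coin.
        if rng.random() < biases[best]:
            counts[best][0] += 1
        else:
            counts[best][1] += 1
    return None, max_tosses  # toss budget exhausted without reaching 1 - delta


if __name__ == "__main__":
    coin, used = find_biased_coin([0.4, 0.4, 0.6], alpha=0.5, p_hi=0.6,
                                  p_lo=0.4, delta=0.05, rng=random.Random(0))
    print(coin, used)
```

Working in log-odds rather than raw probabilities avoids floating-point underflow when a coin has been tossed many times; the threshold test is the same condition as "posterior at least 1 - δ" expressed on the odds scale.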

Original language: English (US)
Pages (from-to): 394-407
Number of pages: 14
Journal: Journal of Machine Learning Research
State: Published - Jan 1 2014
Externally published: Yes
Event: 27th Conference on Learning Theory, COLT 2014 - Barcelona, Spain
Duration: Jun 13 2014 → Jun 15 2014


  • Algorithms
  • Bandits
  • Bayesian
  • Biased coin
  • Learning
  • Ranking and selection
  • Sequential selection

ASJC Scopus subject areas

  • Software
  • Control and Systems Engineering
  • Statistics and Probability
  • Artificial Intelligence

