Universal hypothesis testing in the learning-limited regime

Benjamin G. Kelly, Thitidej Tularak, Aaron B. Wagner, Pramod Viswanath

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

Given training sequences generated by two distinct, but unknown distributions sharing a common alphabet, we seek a classifier that can correctly decide whether a third test sequence is generated by the first or second distribution using only the training data. To model 'limited learning' we allow the alphabet size to grow and therefore probability distributions to change with the blocklength. We prove that a natural choice, namely a generalized likelihood ratio test, is universally consistent (has a probability of error tending to zero with the blocklength for all underlying distributions) when the alphabet size is sub-linear in the blocklength, but inconsistent for linear alphabet growth. For up-to quadratic alphabet growth, in a regime where all probabilities are of the same order, we prove the universally consistency of a new test and show there are no such tests when the alphabet grows quadratically or faster.

Original languageEnglish (US)
Title of host publication2010 IEEE International Symposium on Information Theory, ISIT 2010 - Proceedings
Pages1478-1482
Number of pages5
DOIs
StatePublished - Aug 23 2010
Event2010 IEEE International Symposium on Information Theory, ISIT 2010 - Austin, TX, United States
Duration: Jun 13 2010Jun 18 2010

Publication series

NameIEEE International Symposium on Information Theory - Proceedings
ISSN (Print)2157-8103

Other

Other2010 IEEE International Symposium on Information Theory, ISIT 2010
CountryUnited States
CityAustin, TX
Period6/13/106/18/10

Fingerprint

Hypothesis Testing
Testing
Probability distributions
Generalized Likelihood Ratio Test
Classifiers
Inconsistent
Sharing
Probability Distribution
Classifier
Distinct
Unknown
Zero
Learning
Training
Model

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Information Systems
  • Modeling and Simulation
  • Applied Mathematics

Cite this

Kelly, B. G., Tularak, T., Wagner, A. B., & Viswanath, P. (2010). Universal hypothesis testing in the learning-limited regime. In 2010 IEEE International Symposium on Information Theory, ISIT 2010 - Proceedings (pp. 1478-1482). [5513583] (IEEE International Symposium on Information Theory - Proceedings). https://doi.org/10.1109/ISIT.2010.5513583

Universal hypothesis testing in the learning-limited regime. / Kelly, Benjamin G.; Tularak, Thitidej; Wagner, Aaron B.; Viswanath, Pramod.

2010 IEEE International Symposium on Information Theory, ISIT 2010 - Proceedings. 2010. p. 1478-1482 5513583 (IEEE International Symposium on Information Theory - Proceedings).

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Kelly, BG, Tularak, T, Wagner, AB & Viswanath, P 2010, Universal hypothesis testing in the learning-limited regime. in 2010 IEEE International Symposium on Information Theory, ISIT 2010 - Proceedings., 5513583, IEEE International Symposium on Information Theory - Proceedings, pp. 1478-1482, 2010 IEEE International Symposium on Information Theory, ISIT 2010, Austin, TX, United States, 6/13/10. https://doi.org/10.1109/ISIT.2010.5513583
Kelly BG, Tularak T, Wagner AB, Viswanath P. Universal hypothesis testing in the learning-limited regime. In 2010 IEEE International Symposium on Information Theory, ISIT 2010 - Proceedings. 2010. p. 1478-1482. 5513583. (IEEE International Symposium on Information Theory - Proceedings). https://doi.org/10.1109/ISIT.2010.5513583
Kelly, Benjamin G. ; Tularak, Thitidej ; Wagner, Aaron B. ; Viswanath, Pramod. / Universal hypothesis testing in the learning-limited regime. 2010 IEEE International Symposium on Information Theory, ISIT 2010 - Proceedings. 2010. pp. 1478-1482 (IEEE International Symposium on Information Theory - Proceedings).
@inproceedings{0ceb54b213784d4ca9a14937648d98fa,
title = "Universal hypothesis testing in the learning-limited regime",
abstract = "Given training sequences generated by two distinct, but unknown distributions sharing a common alphabet, we seek a classifier that can correctly decide whether a third test sequence is generated by the first or second distribution using only the training data. To model 'limited learning' we allow the alphabet size to grow and therefore probability distributions to change with the blocklength. We prove that a natural choice, namely a generalized likelihood ratio test, is universally consistent (has a probability of error tending to zero with the blocklength for all underlying distributions) when the alphabet size is sub-linear in the blocklength, but inconsistent for linear alphabet growth. For up-to quadratic alphabet growth, in a regime where all probabilities are of the same order, we prove the universally consistency of a new test and show there are no such tests when the alphabet grows quadratically or faster.",
author = "Kelly, {Benjamin G.} and Thitidej Tularak and Wagner, {Aaron B.} and Pramod Viswanath",
year = "2010",
month = "8",
day = "23",
doi = "10.1109/ISIT.2010.5513583",
language = "English (US)",
isbn = "9781424469604",
series = "IEEE International Symposium on Information Theory - Proceedings",
pages = "1478--1482",
booktitle = "2010 IEEE International Symposium on Information Theory, ISIT 2010 - Proceedings",

}

TY - GEN

T1 - Universal hypothesis testing in the learning-limited regime

AU - Kelly, Benjamin G.

AU - Tularak, Thitidej

AU - Wagner, Aaron B.

AU - Viswanath, Pramod

PY - 2010/8/23

Y1 - 2010/8/23

N2 - Given training sequences generated by two distinct, but unknown distributions sharing a common alphabet, we seek a classifier that can correctly decide whether a third test sequence is generated by the first or second distribution using only the training data. To model 'limited learning' we allow the alphabet size to grow and therefore probability distributions to change with the blocklength. We prove that a natural choice, namely a generalized likelihood ratio test, is universally consistent (has a probability of error tending to zero with the blocklength for all underlying distributions) when the alphabet size is sub-linear in the blocklength, but inconsistent for linear alphabet growth. For up-to quadratic alphabet growth, in a regime where all probabilities are of the same order, we prove the universally consistency of a new test and show there are no such tests when the alphabet grows quadratically or faster.

AB - Given training sequences generated by two distinct, but unknown distributions sharing a common alphabet, we seek a classifier that can correctly decide whether a third test sequence is generated by the first or second distribution using only the training data. To model 'limited learning' we allow the alphabet size to grow and therefore probability distributions to change with the blocklength. We prove that a natural choice, namely a generalized likelihood ratio test, is universally consistent (has a probability of error tending to zero with the blocklength for all underlying distributions) when the alphabet size is sub-linear in the blocklength, but inconsistent for linear alphabet growth. For up-to quadratic alphabet growth, in a regime where all probabilities are of the same order, we prove the universally consistency of a new test and show there are no such tests when the alphabet grows quadratically or faster.

UR - http://www.scopus.com/inward/record.url?scp=77955700822&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=77955700822&partnerID=8YFLogxK

U2 - 10.1109/ISIT.2010.5513583

DO - 10.1109/ISIT.2010.5513583

M3 - Conference contribution

AN - SCOPUS:77955700822

SN - 9781424469604

T3 - IEEE International Symposium on Information Theory - Proceedings

SP - 1478

EP - 1482

BT - 2010 IEEE International Symposium on Information Theory, ISIT 2010 - Proceedings

ER -