Universal sequential outlier hypothesis testing

Yun Li, Sirin Nitinawarat, Venugopal V. Veeravalli

Research output: Contribution to journal › Article

Abstract

Universal outlier hypothesis testing is studied in a sequential setting. Multiple observation sequences are collected, a small subset of which are outliers. A sequence is considered an outlier if the observations in that sequence are generated by an “outlier” distribution, distinct from a common “typical” distribution governing the majority of the sequences. Apart from being distinct, the outlier and typical distributions can be arbitrarily close. The goal is to design a universal test to best discern all the outlier sequences. A universal test with the flavor of the repeated significance test is proposed and its asymptotic performance, as the error probability goes to zero, is characterized under various universal settings. The proposed test is shown to be universally consistent. For the model with at most one outlier, conditioned on the outlier being present, the test is shown to be asymptotically optimal universally when the typical distribution is known and as the number of sequences goes to infinity when neither the outlier nor the typical distribution is known. With multiple identical outliers, the test is shown to be asymptotically optimal universally when the number of outliers is the largest possible and with the typical distribution being known, and its asymptotic performance with neither the outlier nor the typical distribution being known is also characterized. Extensions of the findings to models with multiple distinct outliers are also discussed. In all cases, it is shown that the asymptotic performance guarantees for the proposed test when neither the outlier nor the typical distribution is known converge to those when the typical distribution is known as the number of sequences goes to infinity.

Original language: English (US)
Pages (from-to): 309-344
Number of pages: 36
Journal: Sequential Analysis
Volume: 36
Issue number: 3
State: Published - Jul 3 2017

Keywords

  • Anomaly detection
  • consistency
  • data-driven classification
  • exponential consistency
  • fraud detection
  • generalized likelihood test
  • multihypothesis sequential probability ratio test
  • nonparametric sequential testing
  • outlier detection
  • repeated significance test

ASJC Scopus subject areas

  • Statistics and Probability
  • Modeling and Simulation
