TY - JOUR
T1 - Universal sequential outlier hypothesis testing
AU - Li, Yun
AU - Nitinawarat, Sirin
AU - Veeravalli, Venugopal V.
N1 - Funding Information:
This work was supported by the Air Force Office of Scientific Research (AFOSR) under Grant FA9550-10-1-0458 and by the National Science Foundation under grants CCF 11-11342 and CCF 16-18658, through the University of Illinois at Urbana–Champaign.
Publisher Copyright:
© 2017 Taylor & Francis.
PY - 2017/7/3
Y1 - 2017/7/3
N2 - Universal outlier hypothesis testing is studied in a sequential setting. Multiple observation sequences are collected, a small subset of which are outliers. A sequence is considered an outlier if the observations in that sequence are generated by an “outlier” distribution, distinct from a common “typical” distribution governing the majority of the sequences. Apart from being distinct, the outlier and typical distributions can be arbitrarily close. The goal is to design a universal test to best discern all the outlier sequences. A universal test with the flavor of the repeated significance test is proposed and its asymptotic performance, as the error probability goes to zero, is characterized under various universal settings. The proposed test is shown to be universally consistent. For the model with at most one outlier, conditioned on the outlier being present, the test is shown to be asymptotically optimal universally when the typical distribution is known and as the number of sequences goes to infinity when neither the outlier nor the typical distribution is known. With multiple identical outliers, the test is shown to be asymptotically optimal universally when the number of outliers is the largest possible and with the typical distribution being known, and its asymptotic performance with neither the outlier nor the typical distribution being known is also characterized. Extensions of the findings to models with multiple distinct outliers are also discussed. In all cases, it is shown that the asymptotic performance guarantees for the proposed test when neither the outlier nor the typical distribution is known converge to those when the typical distribution is known as the number of sequences goes to infinity.
AB - Universal outlier hypothesis testing is studied in a sequential setting. Multiple observation sequences are collected, a small subset of which are outliers. A sequence is considered an outlier if the observations in that sequence are generated by an “outlier” distribution, distinct from a common “typical” distribution governing the majority of the sequences. Apart from being distinct, the outlier and typical distributions can be arbitrarily close. The goal is to design a universal test to best discern all the outlier sequences. A universal test with the flavor of the repeated significance test is proposed and its asymptotic performance, as the error probability goes to zero, is characterized under various universal settings. The proposed test is shown to be universally consistent. For the model with at most one outlier, conditioned on the outlier being present, the test is shown to be asymptotically optimal universally when the typical distribution is known and as the number of sequences goes to infinity when neither the outlier nor the typical distribution is known. With multiple identical outliers, the test is shown to be asymptotically optimal universally when the number of outliers is the largest possible and with the typical distribution being known, and its asymptotic performance with neither the outlier nor the typical distribution being known is also characterized. Extensions of the findings to models with multiple distinct outliers are also discussed. In all cases, it is shown that the asymptotic performance guarantees for the proposed test when neither the outlier nor the typical distribution is known converge to those when the typical distribution is known as the number of sequences goes to infinity.
KW - Anomaly detection
KW - consistency
KW - data-driven classification
KW - exponential consistency
KW - fraud detection
KW - generalized likelihood test
KW - multihypothesis sequential probability ratio test
KW - nonparametric sequential testing
KW - outlier detection
KW - repeated significance test
UR - http://www.scopus.com/inward/record.url?scp=85029921551&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85029921551&partnerID=8YFLogxK
U2 - 10.1080/07474946.2017.1360086
DO - 10.1080/07474946.2017.1360086
M3 - Article
AN - SCOPUS:85029921551
SN - 0747-4946
VL - 36
SP - 309
EP - 344
JO - Sequential Analysis
JF - Sequential Analysis
IS - 3
ER -