TY - GEN
T1 - Linear-complexity exponentially-consistent tests for universal outlying sequence detection
AU - Bu, Yuheng
AU - Zou, Shaofeng
AU - Veeravalli, Venugopal V.
N1 - Publisher Copyright: © 2017 IEEE.
PY - 2017/8/9
Y1 - 2017/8/9
N2 - We study a universal outlying sequence detection problem, in which there are M sequences of samples, out of which a small subset of outliers needs to be detected. A sequence is considered an outlier if its observations are generated by a distribution different from those generating the observations in the majority of the sequences. In the universal setting, the goal is to identify all the outliers without any knowledge of the underlying generating distributions. In prior work, this problem was studied as a universal hypothesis testing problem, and a generalized likelihood (GL) test was constructed and its asymptotic performance was characterized. In this paper, we propose a different class of tests for this problem based on distribution clustering. These tests are shown to be exponentially consistent, and their time complexity is linear in the total number of sequences, in contrast with the GL test, whose time complexity is exponential in the number of outliers. Furthermore, our clustering-based tests are applicable to more general scenarios. For example, when both the typical and outlier distributions form clusters, the clustering-based test is exponentially consistent, but the GL test is not even applicable.
UR - http://www.scopus.com/inward/record.url?scp=85029636754&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85029636754&partnerID=8YFLogxK
U2 - 10.1109/ISIT.2017.8006676
DO - 10.1109/ISIT.2017.8006676
M3 - Conference contribution
AN - SCOPUS:85029636754
T3 - IEEE International Symposium on Information Theory - Proceedings
SP - 988
EP - 992
BT - 2017 IEEE International Symposium on Information Theory, ISIT 2017
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2017 IEEE International Symposium on Information Theory, ISIT 2017
Y2 - 25 June 2017 through 30 June 2017
ER -