TY - GEN

T1 - Linear-complexity exponentially-consistent tests for universal outlying sequence detection

AU - Bu, Yuheng

AU - Zou, Shaofeng

AU - Veeravalli, Venugopal V.

N1 - Publisher Copyright: © 2017 IEEE.

PY - 2017/8/9

Y1 - 2017/8/9

AB - We study a universal outlying sequence detection problem, in which there are M sequences of samples out of which a small subset of outliers need to be detected. A sequence is considered as an outlier if the observations therein are generated by a distribution different from those generating the observations in the majority of the sequences. In the universal setting, the goal is to identify all the outliers without any knowledge about the underlying generating distributions. In prior work, this problem was studied as a universal hypothesis testing problem, and a generalized likelihood (GL) test was constructed and its asymptotic performance characterized. In this paper, we propose a different class of tests for this problem based on distribution clustering. Such tests are shown to be exponentially consistent and their time complexity is linear in the total number of sequences, in contrast with the GL test, which has time complexity that is exponential in the number of outliers. Furthermore, our tests based on clustering are applicable to more general scenarios. For example, when both the typical and outlier distributions form clusters, the clustering based test is exponentially consistent, but the GL test is not even applicable.

UR - http://www.scopus.com/inward/record.url?scp=85029636754&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85029636754&partnerID=8YFLogxK

U2 - 10.1109/ISIT.2017.8006676

DO - 10.1109/ISIT.2017.8006676

M3 - Conference contribution

AN - SCOPUS:85029636754

T3 - IEEE International Symposium on Information Theory - Proceedings

SP - 988

EP - 992

BT - 2017 IEEE International Symposium on Information Theory, ISIT 2017

PB - Institute of Electrical and Electronics Engineers Inc.

T2 - 2017 IEEE International Symposium on Information Theory, ISIT 2017

Y2 - 25 June 2017 through 30 June 2017

ER -