Universal outlier hypothesis testing: Application to anomaly detection

Yun Li, Sirin Nitinawarat, Yu Su, Venugopal Varadachari Veeravalli

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

In outlier hypothesis testing, multiple observation sequences are collected, a small subset of which are outliers. Observations in an outlier sequence are generated by a mechanism different from that generating the observations in the majority of sequences. The goal is to best discern all the outlier sequences without any knowledge of the underlying generating mechanisms. A generalized likelihood test is considered in the fixed sample size setting. In the sequential setting, a test based on the Multihypothesis Sequential Probability Ratio Test and the repeated significance test is considered. The sequential test outperforms the generalized likelihood test when the lengths of the observation sequences exceed certain values. Applied to a real data set for spam detection, the performance of the proposed tests is shown to be superior to those based on the maximum mean discrepancy for large sample size.
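The abstract does not give the test statistic, so the following is only a minimal sketch of one common form of a generalized likelihood test. It assumes a single outlier, a finite alphabet, and equal-length sequences; the function names and toy data are hypothetical and not taken from the paper. For each candidate index, the sketch pools the empirical distributions of the remaining sequences and sums their KL divergences to that pool. The candidate whose removal leaves the most homogeneous remainder is declared the outlier.

```python
import numpy as np

def empirical_distribution(seq, alphabet_size):
    """Relative frequency of each symbol 0..alphabet_size-1 in seq."""
    counts = np.bincount(np.asarray(seq), minlength=alphabet_size)
    return counts / counts.sum()

def kl_divergence(p, q):
    """KL divergence D(p || q) in nats, using the convention 0 * log 0 = 0."""
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

def glt_single_outlier(sequences, alphabet_size):
    """Illustrative single-outlier test (assumed form, equal-length sequences).

    Returns the index whose exclusion minimizes the total divergence of the
    remaining empirical distributions from their average, plus all scores.
    """
    gammas = np.array([empirical_distribution(s, alphabet_size) for s in sequences])
    scores = np.empty(len(gammas))
    for i in range(len(gammas)):
        rest = np.delete(gammas, i, axis=0)   # hypothesis: sequence i is the outlier
        pooled = rest.mean(axis=0)            # estimate of the typical distribution
        scores[i] = sum(kl_divergence(g, pooled) for g in rest)
    return int(np.argmin(scores)), scores

# Toy data: four sequences with uniform symbol frequencies and one skewed sequence.
typical = [0, 1, 2, 3] * 25
outlier = [0] * 70 + [1, 2, 3] * 10
index, scores = glt_single_outlier([typical, typical, outlier, typical, typical], 4)
print(index, scores)
```

No knowledge of the typical or outlier distribution is used here, which is the sense in which such a test is universal. Handling several outliers would require searching over candidate subsets rather than single indices.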

Original language: English (US)
Title of host publication: 2015 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2015 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 5595-5599
Number of pages: 5
ISBN (Electronic): 9781467369978
DOI: https://doi.org/10.1109/ICASSP.2015.7179042
State: Published - Aug 4 2015
Event: 40th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2015 - Brisbane, Australia
Duration: Apr 19 2015 to Apr 24 2015

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume: 2015-August
ISSN (Print): 1520-6149

Other

Other: 40th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2015
Country: Australia
City: Brisbane
Period: 4/19/15 to 4/24/15

Keywords

  • anomaly detection
  • generalized likelihood test
  • maximum mean discrepancy
  • multihypothesis sequential probability ratio test
  • universal outlier hypothesis testing

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering

Cite this

Li, Y., Nitinawarat, S., Su, Y., & Veeravalli, V. V. (2015). Universal outlier hypothesis testing: Application to anomaly detection. In 2015 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2015 - Proceedings (pp. 5595-5599). [7179042] (ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings; Vol. 2015-August). Institute of Electrical and Electronics Engineers Inc.. https://doi.org/10.1109/ICASSP.2015.7179042

@inproceedings{1546227db04843d892b8ebdd1871de9f,
title = "Universal outlier hypothesis testing: Application to anomaly detection",
keywords = "anomaly detection, generalized likelihood test, maximum mean discrepancy, multihypothesis sequential probability ratio test, universal outlier hypothesis testing",
author = "Yun Li and Sirin Nitinawarat and Yu Su and Veeravalli, {Venugopal Varadachari}",
year = "2015",
month = "8",
day = "4",
doi = "10.1109/ICASSP.2015.7179042",
language = "English (US)",
series = "ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
pages = "5595--5599",
booktitle = "2015 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2015 - Proceedings",
address = "United States",

}

TY - GEN

T1 - Universal outlier hypothesis testing

T2 - Application to anomaly detection

AU - Li, Yun

AU - Nitinawarat, Sirin

AU - Su, Yu

AU - Veeravalli, Venugopal Varadachari

PY - 2015/8/4

Y1 - 2015/8/4

KW - anomaly detection

KW - generalized likelihood test

KW - maximum mean discrepancy

KW - multihypothesis sequential probability ratio test

KW - universal outlier hypothesis testing

UR - http://www.scopus.com/inward/record.url?scp=84946074694&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84946074694&partnerID=8YFLogxK

U2 - 10.1109/ICASSP.2015.7179042

DO - 10.1109/ICASSP.2015.7179042

M3 - Conference contribution

AN - SCOPUS:84946074694

T3 - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings

SP - 5595

EP - 5599

BT - 2015 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2015 - Proceedings

PB - Institute of Electrical and Electronics Engineers Inc.

ER -