TY - JOUR
T1 - Real-time Bayesian anomaly detection in streaming environmental data
AU - Hill, David J.
AU - Minsker, Barbara S.
AU - Amir, Eyal
PY - 2009
Y1 - 2009/4/1
AB - With large volumes of data arriving in near real time from environmental sensors, there is a need for automated detection of anomalous data caused by sensor or transmission errors or by infrequent system behaviors. This study develops and evaluates three automated anomaly detection methods using dynamic Bayesian networks (DBNs), which perform fast, incremental evaluation of data as they become available, scale to large quantities of data, and require no a priori information regarding process variables or types of anomalies that may be encountered. This study investigates these methods' abilities to identify anomalies in eight meteorological data streams from Corpus Christi, Texas. The results indicate that DBN-based detectors, using either robust Kalman filtering or Rao-Blackwellized particle filtering, outperform a DBN-based detector using Kalman filtering, with the former having false positive/negative rates of less than 2%. These methods were successful at identifying data anomalies caused by two real events: a sensor failure and a large storm.
UR - http://www.scopus.com/inward/record.url?scp=79551660724&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=79551660724&partnerID=8YFLogxK
U2 - 10.1029/2008WR006956
DO - 10.1029/2008WR006956
M3 - Article
AN - SCOPUS:79551660724
SN - 0043-1397
VL - 46
JO - Water Resources Research
JF - Water Resources Research
IS - 4
M1 - W00D28
ER -