Root cause diagnosis in error-propagating networks

Eunsoo Seo, Gulustan Dogan, Tarek Abdelzaher, Theodore Brown

Research output: Contribution to journal › Article

Abstract

Various types of errors can propagate in networks, and they are usually hard to diagnose. For example, social networks spread rumors as well as useful information, and computer networks can spread Internet worms or malicious packets. In many cases, it is very hard to find the root cause (a.k.a. the initial rumor spreader) of such errors without complete knowledge of the error propagation. We aim to find the root cause node when only limited information about error propagation is available. We assume that a very small number of monitor nodes in the network report whether the error reached them. Under this assumption, we first propose an algorithm that finds the most probable root cause node. Second, to improve the accuracy of root cause analysis, we propose another algorithm that makes use of the timestamps of error reception. Finally, we study how to select monitors effectively so that root cause analysis can be accurate. On real networks from various domains, our algorithms are shown to be very effective.
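The abstract does not specify the algorithms themselves, so the following is only an illustrative sketch of the first problem setting (ranking candidate root cause nodes from a few monitor reports), not the authors' method. It assumes an independent-cascade propagation model with a single forwarding probability `p`, and it estimates each candidate's likelihood by Monte Carlo simulation. The function names, the toy graph, and all parameters are hypothetical.

```python
import random
from collections import deque


def simulate_spread(graph, source, p, rng):
    """One independent-cascade run (assumed model): every edge leaving a
    newly infected node forwards the error with probability p."""
    infected = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor not in infected and rng.random() < p:
                infected.add(neighbor)
                queue.append(neighbor)
    return infected


def rank_root_causes(graph, observations, p=0.5, trials=2000, seed=0):
    """Rank every node by the estimated probability that an error starting
    there reproduces the monitors' reports.

    graph: dict mapping a node to the list of nodes it can forward to.
    observations: dict mapping a monitor node to True (error seen) or False.
    """
    rng = random.Random(seed)
    nodes = set(graph) | {v for targets in graph.values() for v in targets}
    scores = {}
    for candidate in sorted(nodes):
        matches = 0
        for _ in range(trials):
            infected = simulate_spread(graph, candidate, p, rng)
            # A run counts only if every monitor report is reproduced.
            if all((m in infected) == saw for m, saw in observations.items()):
                matches += 1
        scores[candidate] = matches / trials
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy network: "a" forwards to "b" and "c"; "d" forwards to "c" and "e".
toy_graph = {"a": ["b", "c"], "d": ["c", "e"]}
# Three monitors: "b" and "c" saw the error, "e" did not.
toy_reports = {"b": True, "c": True, "e": False}
ranking = rank_root_causes(toy_graph, toy_reports)
print(ranking[0][0])  # "a" is the only node that can reach both "b" and "c"
```

In this toy network only `a` can explain both positive reports, so every other candidate scores zero. Using reception timestamps or choosing monitor locations, as the abstract also mentions, would need additional modeling that this sketch does not attempt.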

Original language: English (US)
Pages (from-to): 1297-1308
Number of pages: 12
Journal: Security and Communication Networks
Volume: 9
Issue number: 11
DOI: 10.1002/sec.1415
State: Published - Jul 25 2016

Fingerprint

Spreaders
Computer networks
Internet

Keywords

  • distributed systems
  • error classification
  • error propagation
  • error-propagating networks
  • root cause diagnosis

ASJC Scopus subject areas

  • Information Systems
  • Computer Networks and Communications

Cite this

Root cause diagnosis in error-propagating networks. / Seo, Eunsoo; Dogan, Gulustan; Abdelzaher, Tarek; Brown, Theodore.

In: Security and Communication Networks, Vol. 9, No. 11, 25.07.2016, p. 1297-1308.

Seo, Eunsoo ; Dogan, Gulustan ; Abdelzaher, Tarek ; Brown, Theodore. / Root cause diagnosis in error-propagating networks. In: Security and Communication Networks. 2016 ; Vol. 9, No. 11. pp. 1297-1308.
@article{4c521a15a6d449379bacd73f4f650c96,
title = "Root cause diagnosis in error-propagating networks",
abstract = "Various types of errors can propagate in networks, and they are usually hard to diagnose. For example, social networks spread rumors as well as useful information. Computer networks can spread Internet worms or malicious packets. In many cases, it is very hard to find the root cause (a.k.a. initial rumor spreader) of such errors without complete knowledge of the error propagation. We aim to find the root cause node when there is limited information about error propagation. We assume that there are very small number of monitor nodes in the network reporting whether error reached them or not. With this assumption, we first propose an algorithm that finds the most probable root cause node. Second, to improve the accuracy of root cause analysis, we propose another algorithm that makes use of timestamp of error reception. Finally, we study how to select monitors effectively so that root cause analysis can be accurate. With real networks from various domains, our algorithms are shown to be very effective.",
keywords = "distributed systems, error classification, error propagation, error-propagating networks, root cause diagnosis",
author = "Eunsoo Seo and Gulustan Dogan and Tarek Abdelzaher and Theodore Brown",
year = "2016",
month = "7",
day = "25",
doi = "10.1002/sec.1415",
language = "English (US)",
volume = "9",
pages = "1297--1308",
journal = "Security and Communication Networks",
issn = "1939-0122",
publisher = "John Wiley and Sons Inc.",
number = "11",

}

TY - JOUR
T1 - Root cause diagnosis in error-propagating networks
AU - Seo, Eunsoo
AU - Dogan, Gulustan
AU - Abdelzaher, Tarek
AU - Brown, Theodore
PY - 2016/7/25
Y1 - 2016/7/25

N2 - Various types of errors can propagate in networks, and they are usually hard to diagnose. For example, social networks spread rumors as well as useful information. Computer networks can spread Internet worms or malicious packets. In many cases, it is very hard to find the root cause (a.k.a. initial rumor spreader) of such errors without complete knowledge of the error propagation. We aim to find the root cause node when there is limited information about error propagation. We assume that there are very small number of monitor nodes in the network reporting whether error reached them or not. With this assumption, we first propose an algorithm that finds the most probable root cause node. Second, to improve the accuracy of root cause analysis, we propose another algorithm that makes use of timestamp of error reception. Finally, we study how to select monitors effectively so that root cause analysis can be accurate. With real networks from various domains, our algorithms are shown to be very effective.

AB - Various types of errors can propagate in networks, and they are usually hard to diagnose. For example, social networks spread rumors as well as useful information. Computer networks can spread Internet worms or malicious packets. In many cases, it is very hard to find the root cause (a.k.a. initial rumor spreader) of such errors without complete knowledge of the error propagation. We aim to find the root cause node when there is limited information about error propagation. We assume that there are very small number of monitor nodes in the network reporting whether error reached them or not. With this assumption, we first propose an algorithm that finds the most probable root cause node. Second, to improve the accuracy of root cause analysis, we propose another algorithm that makes use of timestamp of error reception. Finally, we study how to select monitors effectively so that root cause analysis can be accurate. With real networks from various domains, our algorithms are shown to be very effective.

KW - distributed systems
KW - error classification
KW - error propagation
KW - error-propagating networks
KW - root cause diagnosis
UR - http://www.scopus.com/inward/record.url?scp=84954290233&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84954290233&partnerID=8YFLogxK
U2 - 10.1002/sec.1415
DO - 10.1002/sec.1415
M3 - Article
AN - SCOPUS:84954290233
VL - 9
SP - 1297
EP - 1308
JO - Security and Communication Networks
JF - Security and Communication Networks
SN - 1939-0122
IS - 11
ER -