TY - GEN
T1 - Metric forensics
T2 - 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD-2010
AU - Henderson, Keith
AU - Eliassi-Rad, Tina
AU - Faloutsos, Christos
AU - Akoglu, Leman
AU - Li, Lei
AU - Maruhashi, Koji
AU - Prakash, B. Aditya
AU - Tong, Hanghang
PY - 2010
Y1 - 2010
N2 - Advances in data collection and storage capacity have made it increasingly possible to collect highly volatile graph data for analysis. Existing graph analysis techniques are not appropriate for such data, especially in cases where streaming or near-real-time results are required. An example that has drawn significant research interest is the cyber-security domain, where internet communication traces are collected and real-time discovery of events, behaviors, patterns, and anomalies is desired. We propose METRICFORENSICS, a scalable framework for analysis of volatile graphs. METRICFORENSICS combines a multi-level "drill down" approach, a collection of user-selected graph metrics, and a collection of analysis techniques. At each successive level, more sophisticated metrics are computed and the graph is viewed at finer temporal resolutions. In this way, METRICFORENSICS scales to highly volatile graphs by only allocating resources for computationally expensive analysis when an interesting event is discovered at a coarser resolution first. We test METRIC-FORENSICS on three real-world graphs: an enterprise IP trace, a trace of legitimate and malicious network traffic from a research institution, and the MIT Reality Mining proximity sensor data. Our largest graph has ∼3M vertices and ∼32M edges, spanning 4.5 days. The results demonstrate the scalability and capability of METRICFORENSICS in analyzing volatile graphs; and highlight four novel phenomena in such graphs: elbows, broken correlations, prolonged spikes, and lightweight stars.
AB - Advances in data collection and storage capacity have made it increasingly possible to collect highly volatile graph data for analysis. Existing graph analysis techniques are not appropriate for such data, especially in cases where streaming or near-real-time results are required. An example that has drawn significant research interest is the cyber-security domain, where internet communication traces are collected and real-time discovery of events, behaviors, patterns, and anomalies is desired. We propose METRICFORENSICS, a scalable framework for analysis of volatile graphs. METRICFORENSICS combines a multi-level "drill down" approach, a collection of user-selected graph metrics, and a collection of analysis techniques. At each successive level, more sophisticated metrics are computed and the graph is viewed at finer temporal resolutions. In this way, METRICFORENSICS scales to highly volatile graphs by only allocating resources for computationally expensive analysis when an interesting event is discovered at a coarser resolution first. We test METRIC-FORENSICS on three real-world graphs: an enterprise IP trace, a trace of legitimate and malicious network traffic from a research institution, and the MIT Reality Mining proximity sensor data. Our largest graph has ∼3M vertices and ∼32M edges, spanning 4.5 days. The results demonstrate the scalability and capability of METRICFORENSICS in analyzing volatile graphs; and highlight four novel phenomena in such graphs: elbows, broken correlations, prolonged spikes, and lightweight stars.
KW - Graph mining
KW - Temporal analysis
KW - Volatile graphs
UR - http://www.scopus.com/inward/record.url?scp=77956201937&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=77956201937&partnerID=8YFLogxK
U2 - 10.1145/1835804.1835828
DO - 10.1145/1835804.1835828
M3 - Conference contribution
AN - SCOPUS:77956201937
SN - 9781450300551
T3 - Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
SP - 163
EP - 172
BT - KDD'10 - Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data
Y2 - 25 July 2010 through 28 July 2010
ER -