TY - GEN
T1 - IntSight
T2 - 16th ACM Conference on Emerging Networking Experiments and Technologies, CoNEXT 2020
AU - Marques, Jonatas
AU - Levchenko, Kirill
AU - Gaspary, Luciano
N1 - We thank the CoNEXT reviewers and shepherd for their insightful comments and constructive feedback. This work was supported in part by CNPq - Brazil, NSF - USA (CNS-1740911), RNP - Brazil, CAPES - Brazil (Finance Code 1), and FAPESP - Brazil (15/24494-8).
PY - 2020/11/23
Y1 - 2020/11/23
N2 - Performance requirements for many of today's high-performance networks are expressed as service-level objectives (SLOs), i.e., precise guarantees, typically on latency and bandwidth, that a user can expect from the network. For network operators, monitoring their own SLO compliance, and quickly diagnosing any violations, is a critical element for effective operations. Unfortunately, existing network architectures are not engineered for this purpose; there is no mechanism, for example, for the operator to monitor the 95th percentile latency experienced by a customer. Data plane programmability has made per-packet measurements possible but brings the challenge of keeping the monitoring overhead low and practical. In this paper, we present IntSight, a system for highly accurate and fine-grained detection and diagnosis of SLO violations. The main contribution of IntSight is, building upon in-band telemetry, introducing path-wise computation of network metrics and selective generation of reports. We show the effectiveness of IntSight by way of two use cases. Our evaluation using real networks also shows that IntSight generates up to two orders of magnitude less monitoring traffic than state-of-the-art approaches. Furthermore, its processing and memory requirements are low and therefore compatible with currently existing programmable platforms.
AB - Performance requirements for many of today's high-performance networks are expressed as service-level objectives (SLOs), i.e., precise guarantees, typically on latency and bandwidth, that a user can expect from the network. For network operators, monitoring their own SLO compliance, and quickly diagnosing any violations, is a critical element for effective operations. Unfortunately, existing network architectures are not engineered for this purpose; there is no mechanism, for example, for the operator to monitor the 95th percentile latency experienced by a customer. Data plane programmability has made per-packet measurements possible but brings the challenge of keeping the monitoring overhead low and practical. In this paper, we present IntSight, a system for highly accurate and fine-grained detection and diagnosis of SLO violations. The main contribution of IntSight is, building upon in-band telemetry, introducing path-wise computation of network metrics and selective generation of reports. We show the effectiveness of IntSight by way of two use cases. Our evaluation using real networks also shows that IntSight generates up to two orders of magnitude less monitoring traffic than state-of-the-art approaches. Furthermore, its processing and memory requirements are low and therefore compatible with currently existing programmable platforms.
UR - http://www.scopus.com/inward/record.url?scp=85097603486&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85097603486&partnerID=8YFLogxK
U2 - 10.1145/3386367.3431306
DO - 10.1145/3386367.3431306
M3 - Conference contribution
AN - SCOPUS:85097603486
T3 - CoNEXT 2020 - Proceedings of the 16th International Conference on Emerging Networking EXperiments and Technologies
SP - 421
EP - 434
BT - CoNEXT 2020 - Proceedings of the 16th International Conference on Emerging Networking EXperiments and Technologies
PB - Association for Computing Machinery
Y2 - 1 December 2020 through 4 December 2020
ER -