TY - GEN
T1 - Demystifying the Performance of Data Transfers in High-Performance Research Networks
AU - Saeedizade, Ehsan
AU - Zhang, Bing
AU - Arslan, Engin
N1 - The work in this study was supported in part by the NSF grants 2007789 and 2145742.
PY - 2023
Y1 - 2023
N2 - High-speed research networks are built to meet the ever-increasing needs of data-intensive distributed workflows. However, data transfers in these networks often fail to attain the promised transfer rates for several reasons, including I/O and network interference, server misconfigurations, and network anomalies. Although understanding the root causes of performance issues is critical to mitigating them and increasing the utilization of expensive network infrastructures, there is currently no available mechanism to monitor data transfers in these networks. In this paper, we present a scalable, end-to-end monitoring framework to gather and store key performance metrics for file transfers to shed light on the performance of transfers. The evaluation results show that the proposed framework can monitor up to 400 transfers per host and more than 40,000 transfers in total while collecting performance statistics at one-second precision. We also introduce a heuristic method to automatically process the gathered performance metrics and identify the root causes of performance anomalies with an F-score of 87-98%.
AB - High-speed research networks are built to meet the ever-increasing needs of data-intensive distributed workflows. However, data transfers in these networks often fail to attain the promised transfer rates for several reasons, including I/O and network interference, server misconfigurations, and network anomalies. Although understanding the root causes of performance issues is critical to mitigating them and increasing the utilization of expensive network infrastructures, there is currently no available mechanism to monitor data transfers in these networks. In this paper, we present a scalable, end-to-end monitoring framework to gather and store key performance metrics for file transfers to shed light on the performance of transfers. The evaluation results show that the proposed framework can monitor up to 400 transfers per host and more than 40,000 transfers in total while collecting performance statistics at one-second precision. We also introduce a heuristic method to automatically process the gathered performance metrics and identify the root causes of performance anomalies with an F-score of 87-98%.
UR - http://www.scopus.com/inward/record.url?scp=85174287232&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85174287232&partnerID=8YFLogxK
U2 - 10.1109/e-Science58273.2023.10254940
DO - 10.1109/e-Science58273.2023.10254940
M3 - Conference contribution
AN - SCOPUS:85174287232
T3 - Proceedings 2023 IEEE 19th International Conference on e-Science, e-Science 2023
BT - Proceedings 2023 IEEE 19th International Conference on e-Science, e-Science 2023
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 19th IEEE International Conference on e-Science, e-Science 2023
Y2 - 9 October 2023 through 14 October 2023
ER -