TY - GEN
T1 - Collaborative Inference in Resource-Constrained Edge Networks
T2 - 2024 IEEE Military Communications Conference, MILCOM 2024
AU - Ng, Nathan
AU - Souza, Abel
AU - Diggavi, Suhas
AU - Suri, Niranjan
AU - Abdelzaher, Tarek
AU - Towsley, Don
AU - Shenoy, Prashant
N1 - Publisher Copyright:
© 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - Many IoT applications have increasingly adopted machine learning (ML) techniques, such as classification and detection, to enhance automation and decision-making processes. With advances in hardware accelerators such as Nvidia's Jetson embedded GPUs, the computational capabilities of end devices, particularly for ML inference workloads, have significantly improved in recent years. These advances have opened opportunities for distributing computation across the edge network, enabling optimal resource utilization and reducing request latency. Previous research has demonstrated promising results in collaborative inference, where processing units in the edge network, such as end devices and edge servers, collaboratively execute an inference request to minimize latency. This paper explores approaches for implementing collaborative inference on a single model in resource-constrained edge networks, including on-device, device-edge, and edge-edge collaboration. We present preliminary results from proof-of-concept experiments to support each case. We discuss dynamic factors that can impact the performance of these inference execution strategies, such as network variability, thermal constraints, and workload fluctuations. Finally, we outline potential directions for future research.
AB - Many IoT applications have increasingly adopted machine learning (ML) techniques, such as classification and detection, to enhance automation and decision-making processes. With advances in hardware accelerators such as Nvidia's Jetson embedded GPUs, the computational capabilities of end devices, particularly for ML inference workloads, have significantly improved in recent years. These advances have opened opportunities for distributing computation across the edge network, enabling optimal resource utilization and reducing request latency. Previous research has demonstrated promising results in collaborative inference, where processing units in the edge network, such as end devices and edge servers, collaboratively execute an inference request to minimize latency. This paper explores approaches for implementing collaborative inference on a single model in resource-constrained edge networks, including on-device, device-edge, and edge-edge collaboration. We present preliminary results from proof-of-concept experiments to support each case. We discuss dynamic factors that can impact the performance of these inference execution strategies, such as network variability, thermal constraints, and workload fluctuations. Finally, we outline potential directions for future research.
KW - Collaborative inference
KW - edge computing
UR - http://www.scopus.com/inward/record.url?scp=85214555344&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85214555344&partnerID=8YFLogxK
U2 - 10.1109/MILCOM61039.2024.10773876
DO - 10.1109/MILCOM61039.2024.10773876
M3 - Conference contribution
AN - SCOPUS:85214555344
T3 - Proceedings - IEEE Military Communications Conference MILCOM
BT - 2024 IEEE Military Communications Conference, MILCOM 2024
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 28 October 2024 through 1 November 2024
ER -