Failure-Resilient ML Inference at the Edge through Graceful Service Degradation

Walid A. Hanafy, Li Wu, Tarek Abdelzaher, Suhas Diggavi, Prashant Shenoy

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

With recent innovations in machine learning (ML) technologies, especially deep learning, many IoT applications have increasingly relied on ML models for various tasks, such as classification, detection, and decision-making. Most of these tasks are latency-sensitive and depend on models deployed at the edge of the network. Network and edge devices are prone to various kinds of failures, such as transient, crash, or Byzantine failures. Such failures can impact the IoT device's ability to offload tasks, affecting the system's reliability. A traditional solution involves replicating the underlying resources and deploying a failover replica of the ML model. However, edge resources are typically limited, and increasing their size incurs significant computational and infrastructure cost overheads. This paper proposes a range of failover strategies for resource-constrained edge environments, leveraging the flexibility offered by ML models. We explore various approaches for graceful service degradation, such as degraded accuracy, latency, and sampling rate, and highlight their potential benefits and trade-offs. Furthermore, we discuss the challenges associated with these techniques and outline future directions.

Original language: English (US)
Title of host publication: MILCOM 2023 - 2023 IEEE Military Communications Conference
Subtitle of host publication: Communications Supporting Military Operations in a Contested Environment
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 144-149
Number of pages: 6
ISBN (Electronic): 9798350321814
State: Published - 2023
Event: 2023 IEEE Military Communications Conference, MILCOM 2023 - Boston, United States
Duration: Oct 30 2023 → Nov 3 2023

Publication series

Name: MILCOM 2023 - 2023 IEEE Military Communications Conference: Communications Supporting Military Operations in a Contested Environment

Conference

Conference: 2023 IEEE Military Communications Conference, MILCOM 2023
Country/Territory: United States
City: Boston
Period: 10/30/23 → 11/3/23

Keywords

  • Edge computing
  • Graceful degradation
  • ML inference
  • Replication
  • Resilience

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Networks and Communications
  • Signal Processing
  • Safety, Risk, Reliability and Quality
