Deep compressive offloading: Speeding up neural network inference by trading edge computation for network latency

Shuochao Yao, Jinyang Li, Dongxin Liu, Tianshi Wang, Shengzhong Liu, Huajie Shao, Tarek Abdelzaher

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

With recent advances, neural networks have become a crucial building block in intelligent IoT systems and sensing applications. However, their excessive computational demand remains a serious impediment to deployment on low-end IoT devices. With the emergence of edge computing, offloading has become a promising technique to circumvent end-device limitations. However, transferring data between local and edge devices takes up a large proportion of time in existing offloading frameworks, creating a bottleneck for low-latency intelligent services. In this work, we propose a general framework, called deep compressive offloading. By integrating compressive sensing theory and deep learning, our framework can encode data for offloading into tiny sizes with negligible overhead on local devices and decode the data on the edge server, while offering theoretical guarantees on perfect reconstruction and lossless inference. By trading edge computing resources for data transmission time, our design can significantly reduce offloading latency with almost no accuracy loss. We build a deep compressive offloading system to serve state-of-the-art computer vision and speech recognition services. With comprehensive evaluations, our system can consistently reduce end-to-end latency by 2X to 4X with 1% accuracy loss, compared to state-of-the-art neural network offloading systems. In conditions of limited network bandwidth or intensive background traffic, our system can further speed up the neural network inference by up to 35X.
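The paper's encoder is learned jointly with the inference pipeline; as a minimal illustration of the underlying compressive sensing idea only (not the authors' implementation), the sketch below compresses a sparse signal with a random linear projection — shrinking the payload that would cross the network — and recovers it exactly on the "server" side with orthogonal matching pursuit. All names, dimensions, and the sensing matrix here are illustrative assumptions.

```python
import numpy as np

def omp(Phi, y, k):
    """Orthogonal Matching Pursuit: recover a k-sparse x from y = Phi @ x."""
    residual = y.copy()
    support = []
    for _ in range(k):
        # Greedily pick the column most correlated with the residual.
        idx = int(np.argmax(np.abs(Phi.T @ residual)))
        if idx not in support:
            support.append(idx)
        # Least-squares fit of y on the columns selected so far.
        coef, *_ = np.linalg.lstsq(Phi[:, support], y, rcond=None)
        residual = y - Phi[:, support] @ coef
    x_hat = np.zeros(Phi.shape[1])
    x_hat[support] = coef
    return x_hat

rng = np.random.default_rng(0)
n, m, k = 256, 64, 5  # signal dim, measurement dim, sparsity (illustrative)

# A k-sparse signal standing in for the intermediate data to offload.
x = np.zeros(n)
x[rng.choice(n, k, replace=False)] = rng.standard_normal(k)

# "Local device": cheap random projection -> 4x smaller payload to transmit.
Phi = rng.standard_normal((m, n)) / np.sqrt(m)
y = Phi @ x

# "Edge server": reconstruct the full signal from the compressed payload.
x_hat = omp(Phi, y, k)
```

With a Gaussian sensing matrix and m well above the information-theoretic minimum for this sparsity, reconstruction is exact with high probability — the intuition behind the "perfect reconstruction" guarantee the abstract invokes, though the paper replaces the generic sparsity prior with a learned deep decoder.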

Original language: English (US)
Title of host publication: SenSys 2020 - Proceedings of the 2020 18th ACM Conference on Embedded Networked Sensor Systems
Publisher: Association for Computing Machinery, Inc
Pages: 476-488
Number of pages: 13
ISBN (Electronic): 9781450375900
DOIs
State: Published - Nov 16 2020
Event: 18th ACM Conference on Embedded Networked Sensor Systems, SenSys 2020 - Virtual, Online, Japan
Duration: Nov 16 2020 – Nov 19 2020

Publication series

Name: SenSys 2020 - Proceedings of the 2020 18th ACM Conference on Embedded Networked Sensor Systems

Conference

Conference: 18th ACM Conference on Embedded Networked Sensor Systems, SenSys 2020
Country: Japan
City: Virtual, Online
Period: 11/16/20 – 11/19/20

Keywords

  • compressive offloading
  • compressive sensing
  • deep learning
  • edge computing
  • internet of things
  • offloading

ASJC Scopus subject areas

  • Control and Systems Engineering
  • Electrical and Electronic Engineering
  • Computer Networks and Communications

