Adaptive Step-Size Methods for Compressed SGD With Memory Feedback

Adarsh M. Subramaniam, Akshayaa Magesh, Venugopal V. Veeravalli

Research output: Contribution to journal › Article › peer-review

Abstract

Distributed and decentralized optimization problems, such as those encountered in federated learning, often suffer from a communication bottleneck at compute nodes when a large number of messages are aggregated. To address this communication bottleneck, compressed Stochastic Gradient Descent (SGD) algorithms have been proposed. Most existing compressed SGD algorithms use non-adaptive step-sizes (constant or diminishing) that depend on unknown system parameters to provide theoretical convergence guarantees, and in practice the step-sizes are fine-tuned to obtain good empirical performance for a given dataset and learning model. Such fine-tuning may be impractical in many scenarios. It is therefore of interest to study compressed SGD with adaptive step-sizes that do not depend on unknown system parameters. Motivated by prior work that employs adaptive step-sizes for uncompressed SGD, we develop an Armijo-rule-based step-size selection method for compressed SGD with memory feedback. In particular, we first motivate a novel scaling technique for Gradient Descent (GD) with Armijo step-size search, based on which we develop an Armijo step-size search method with scaling for the descent step of the compressed SGD algorithm with memory feedback. We then establish that our algorithm achieves order-optimal convergence rates for convex-smooth and strongly convex-smooth objectives under an interpolation condition, and for non-convex objectives under a strong growth condition. Experiments comparing the proposed algorithm with state-of-the-art compressed SGD methods with memory feedback demonstrate a notable improvement in training loss across various levels of compression.
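To make the pipeline described in the abstract concrete, the following is a minimal sketch of compressed SGD with memory (error) feedback combined with a backtracking Armijo step-size search. It assumes a top-k sparsifier as the compressor and a standard sufficient-decrease condition; the helper names (top_k, armijo_step) and all constants are illustrative assumptions, not the paper's exact algorithm, and the paper's additional scaling technique is omitted.

```python
import numpy as np

def top_k(v, k):
    """Assumed compressor: keep the k largest-magnitude entries, zero the rest."""
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]
    out[idx] = v[idx]
    return out

def armijo_step(f, x, d, eta0=1.0, beta=0.5, c=0.1, max_iters=20):
    """Backtracking Armijo search along direction d: shrink eta until
    f(x - eta*d) <= f(x) - c * eta * ||d||^2 (a standard sufficient-decrease test)."""
    eta, fx, d2 = eta0, f(x), np.dot(d, d)
    for _ in range(max_iters):
        if f(x - eta * d) <= fx - c * eta * d2:
            break
        eta *= beta
    return eta

def compressed_sgd_with_memory(f, grad, x0, k, n_steps=100):
    """Sketch of compressed SGD with memory feedback and Armijo step-size search."""
    x, mem = x0.copy(), np.zeros_like(x0)
    for _ in range(n_steps):
        g = grad(x)              # stochastic gradient at the current iterate
        d = top_k(g + mem, k)    # compress the gradient plus the stored residual
        mem = (g + mem) - d      # memory feedback: keep the compression error
        eta = armijo_step(f, x, d)  # adaptive step-size, no tuned constants
        x -= eta * d             # descend along the compressed direction
    return x
```

In this sketch, memory feedback stores the compression error and re-injects it into the next round's gradient, the standard error-feedback mechanism the abstract refers to, while the Armijo search backtracks along the compressed direction that is actually applied, so no step-size depending on unknown system parameters needs to be specified in advance.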

Original language: English (US)
Pages (from-to): 2394-2406
Number of pages: 13
Journal: IEEE Transactions on Signal Processing
Volume: 72
State: Published - 2024

Keywords

  • Adaptive step-sizes
  • compression
  • distributed optimization
  • memory feedback
  • stochastic gradient descent

ASJC Scopus subject areas

  • Signal Processing
  • Electrical and Electronic Engineering
