Adaptive Step-Size Methods for Compressed SGD

Adarsh M. Subramaniam, Akshayaa Magesh, Venugopal V. Veeravalli

Research output: Contribution to journal › Conference article › peer-review


Compressed Stochastic Gradient Descent (SGD) algorithms have been proposed to address the communication bottleneck in distributed and decentralized optimization problems such as federated machine learning. Many existing compressed SGD algorithms use non-adaptive step-sizes (constant or diminishing) to provide theoretical convergence guarantees. Since non-adaptive step-sizes typically involve unknown system parameters, the step-sizes are fine-tuned in practice to obtain good empirical performance for the given dataset and learning model. Such fine-tuning might be impractical in many scenarios. Thus, it is of interest to study compressed SGD with adaptive step-sizes. Motivated by prior work that uses adaptive step-sizes for uncompressed SGD, we develop an Armijo-rule-based step-size selection method for compressed SGD. In particular, we introduce a scaling technique for the descent step, which we use to establish order-optimal convergence rates for convex smooth and strongly convex smooth objectives under an interpolation condition, and for non-convex objectives under a strong growth condition. We present experimental results on deep neural networks trained on real-world datasets, and compare the performance of our proposed algorithm with state-of-the-art compressed SGD methods to demonstrate improved performance at various levels of compression.
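To make the idea concrete, here is a minimal sketch of one compressed-gradient descent step with Armijo backtracking. The compressor (top-k sparsification), the sufficient-decrease condition, and all parameter names (`gamma`, `beta`, `eta0`) are illustrative assumptions, not the paper's exact algorithm or its scaling technique.

```python
import numpy as np

def top_k(v, k):
    """Top-k sparsification: keep the k largest-magnitude entries
    (a common gradient compressor; an assumption here, not the paper's choice)."""
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]
    out[idx] = v[idx]
    return out

def armijo_compressed_step(f, grad, x, k, gamma=0.1, beta=0.5,
                           eta0=1.0, max_backtracks=30):
    """One step of compressed gradient descent with an Armijo
    backtracking step-size: shrink eta until
        f(x - eta * c) <= f(x) - gamma * eta * <g, c>
    holds, where c is the compressed gradient."""
    g = grad(x)
    c = top_k(g, k)
    eta = eta0
    for _ in range(max_backtracks):
        if f(x - eta * c) <= f(x) - gamma * eta * g.dot(c):
            break
        eta *= beta  # backtrack: step was too aggressive
    return x - eta * c, eta
```

On a simple quadratic objective, iterating this step drives the loss to zero without any hand-tuned step-size, which is the practical appeal of Armijo-type rules described in the abstract.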


Keywords

  • Adaptive step-size
  • Armijo rule
  • compressed SGD
  • federated learning
  • scaling
  • stochastic gradient descent

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering


