Stochastic gradient descent (SGD) is a common algorithm in machine learning, especially when the loss function has a separable form. Here, we consider SGD for constrained convex optimization problems in which the constraint is expressed as an intersection of a large number of convex sets. Projecting the iterate onto each of these convex sets at every iteration is expensive; thus, there is a growing body of work that considers projections onto a random subset of the convex sets. In this paper, we consider a distributed version (parameter-server model) of the random projections, since a centralized approach is too slow for very large-scale problems. Our main result is as follows: under a mild regularity condition on the convex sets, we show that the rate of convergence of distributed SGD with distributed random projections is the same as that of distributed SGD applied to a problem with no constraints, except for a factor which captures the regularity assumption.
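To make the scheme concrete, below is a minimal serial sketch of SGD with a projection onto one randomly chosen constraint set per iteration. It assumes halfspace constraints (so each projection has a closed form); the function names, the toy problem, and the 1/t step size are illustrative choices, and the parameter-server distribution of work analyzed in the paper is not modeled here.

```python
import numpy as np

def project_halfspace(x, a, c):
    # Euclidean projection of x onto the halfspace {y : a^T y <= c}
    viol = a @ x - c
    if viol <= 0:
        return x
    return x - (viol / (a @ a)) * a

def sgd_random_projection(grads, halfspaces, x0, steps=2000, seed=0):
    # Each iteration: one stochastic-gradient step, then a projection
    # onto a single uniformly sampled constraint set.
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    for t in range(1, steps + 1):
        g = grads[rng.integers(len(grads))](x)   # sampled gradient term
        x = x - (1.0 / t) * g                    # diminishing step size
        a, c = halfspaces[rng.integers(len(halfspaces))]
        x = project_halfspace(x, a, c)           # random projection
    return x

# Toy problem: min 0.5 * ||x - b||^2  subject to  x_i <= 0.5 for all i.
b = np.array([1.0, -0.3])
grads = [lambda x: x - b]                        # single gradient term
halfspaces = [(np.array([1.0, 0.0]), 0.5),
              (np.array([0.0, 1.0]), 0.5)]
x_hat = sgd_random_projection(grads, halfspaces, np.zeros(2))
```

On this toy problem the constrained minimizer is (0.5, -0.3), and the iterate approaches it even though only one of the two constraints is enforced at each step.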

Original language: English (US)
Pages (from-to): 373-378
Number of pages: 6
Issue number: 23
State: Published - Jan 1 2018


Keywords

  • Distributed optimization
  • Random projections

ASJC Scopus subject areas

  • Control and Systems Engineering


