Networked Multi-Agent Reinforcement Learning in Continuous Spaces

Kaiqing Zhang, Zhuoran Yang, Tamer Basar

Research output: Chapter in Book/Report/Conference proceedingConference contribution


Many real-world tasks on practical control systems involve the learning and decision-making of multiple agents, under limited communications and observations. In this paper, we study the problem of networked multi-agent reinforcement learning (MARL), where multiple agents perform reinforcement learning in a common environment, and are able to exchange information via a possibly time-varying communication network. In particular, we focus on a collaborative MARL setting where each agent has individual reward functions, and the objective of all the agents is to maximize the network-wide averaged long-term return. To this end, we propose a fully decentralized actor-critic algorithm that only relies on neighbor-to-neighbor communications among agents. To promote the use of the algorithm on practical control systems, we focus on the setting with continuous state and action spaces, and adopt the newly proposed expected policy gradient to reduce the variance of the gradient estimate. We provide convergence guarantees for the algorithm when linear function approximation is employed, and corroborate our theoretical results via simulations.

Original languageEnglish (US)
Title of host publication2018 IEEE Conference on Decision and Control, CDC 2018
PublisherInstitute of Electrical and Electronics Engineers Inc.
Number of pages6
ISBN (Electronic)9781538613955
StatePublished - Jul 2 2018
Event57th IEEE Conference on Decision and Control, CDC 2018 - Miami, United States
Duration: Dec 17 2018Dec 19 2018

Publication series

NameProceedings of the IEEE Conference on Decision and Control
ISSN (Print)0743-1546
ISSN (Electronic)2576-2370


Conference57th IEEE Conference on Decision and Control, CDC 2018
Country/TerritoryUnited States

ASJC Scopus subject areas

  • Control and Systems Engineering
  • Modeling and Simulation
  • Control and Optimization


Dive into the research topics of 'Networked Multi-Agent Reinforcement Learning in Continuous Spaces'. Together they form a unique fingerprint.

Cite this