Finite-sample analysis for decentralized cooperative multi-agent reinforcement learning from batch data

Kaiqing Zhang, Zhuoran Yang, Han Liu, Tong Zhang, Tamer Basar

Research output: Contribution to journal › Conference article › peer-review

Abstract

In contrast to its great empirical success, theoretical understanding of multi-agent reinforcement learning (MARL) remains largely underdeveloped. As an initial attempt, we provide a finite-sample analysis for decentralized cooperative MARL with networked agents. In particular, we consider a team of cooperative agents connected by a time-varying communication network, with no central controller coordinating them. The goal for each agent is to maximize the long-term return associated with the team-average reward, by communicating only with its neighbors over the network. A batch MARL algorithm is developed for this setting, which can be implemented in a decentralized fashion. We then quantify the estimation errors of the action-value functions obtained from our algorithm, establishing their dependence on the function class, the number of samples in each iteration, and the number of iterations. This work appears to be the first finite-sample analysis for decentralized cooperative MARL from batch data.
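The setting described above, in which neighbor-only communication over a network lets every agent track a team-average quantity without a central controller, rests on a consensus-averaging primitive. Below is a minimal sketch of that primitive (our own illustration, not the paper's algorithm); the fixed ring graph and Metropolis mixing weights are assumptions made for the example, whereas the paper treats time-varying networks.

```python
import numpy as np

def metropolis_weights(adj):
    """Build a doubly stochastic mixing matrix from a 0/1 adjacency matrix."""
    n = adj.shape[0]
    deg = adj.sum(axis=1)
    W = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if adj[i, j]:
                # Metropolis rule: weight depends only on local degrees,
                # so each agent can compute its row without global knowledge.
                W[i, j] = 1.0 / (1 + max(deg[i], deg[j]))
    np.fill_diagonal(W, 1.0 - W.sum(axis=1))
    return W

def consensus(values, adj, iters=200):
    """Each agent repeatedly averages with its neighbors only."""
    W = metropolis_weights(adj)
    x = np.array(values, dtype=float)
    for _ in range(iters):
        x = W @ x  # one round of neighbor-only communication
    return x

# Ring of 4 agents holding local reward estimates; repeated mixing drives
# every agent toward the team average (here 3.0) with no coordinator.
adj = np.array([[0, 1, 0, 1],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [1, 0, 1, 0]])
est = consensus([1.0, 2.0, 3.0, 6.0], adj)
```

Decentralized batch MARL schemes of the kind analyzed in the paper interleave such mixing steps with local value-function updates; the sketch isolates only the communication component.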

Original language: English (US)
Pages (from-to): 1049-1056
Number of pages: 8
Journal: IFAC-PapersOnLine
Volume: 53
Issue number: 2
DOIs
State: Published - 2020
Event: 21st IFAC World Congress 2020, Berlin, Germany
Duration: Jul 12, 2020 to Jul 17, 2020

Keywords

  • Decentralized optimization
  • Finite-sample analysis
  • Multi-agent systems
  • Networked systems
  • Reinforcement learning

ASJC Scopus subject areas

  • Control and Systems Engineering
