Reinforcement Learning in Non-Stationary Discrete-Time Linear-Quadratic Mean-Field Games

Muhammad Aneeq Uz Zaman, Kaiqing Zhang, Erik Miehling, Tamer Basar

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution


In this paper, we study large-population multi-agent reinforcement learning (RL) in the context of discrete-time linear-quadratic mean-field games (LQ-MFGs). Our setting differs from most existing work on RL for MFGs, in that we consider a non-stationary MFG over an infinite horizon. We propose an actor-critic algorithm to iteratively compute the mean-field equilibrium (MFE) of the LQ-MFG. There are two primary challenges: i) the non-stationarity of the MFG induces a linear-quadratic tracking problem, which requires solving a backwards-in-time (non-causal) equation that cannot be solved by standard (causal) RL algorithms; ii) many RL algorithms assume that the states are sampled from the stationary distribution of a Markov chain (MC), that is, the chain is already mixed, an assumption that is not satisfied for real data sources. We first identify that the mean-field trajectory follows linear dynamics, allowing the problem to be reformulated as a linear-quadratic Gaussian problem. Under this reformulation, we propose an actor-critic algorithm that allows samples to be drawn from an unmixed MC. Finite-sample convergence guarantees for the algorithm are then provided. To characterize the performance of our algorithm in multi-agent RL, we have developed an error bound with respect to the Nash equilibrium of the finite-population game.
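
To make challenge (i) concrete, the following is a minimal model-based sketch of a linear-quadratic tracking problem. It is not the paper's actor-critic algorithm. It assumes a finite horizon (the paper treats the infinite-horizon case), known deterministic dynamics x_{t+1} = A x_t + B u_t, and a stage cost (x_t - r_t)' Q (x_t - r_t) + u_t' R u_t with the same Q applied at the terminal step. The function names and the reference sequence `refs` are hypothetical and chosen only for illustration. The point is that the feedforward term at time t is computed from `s`, which is propagated backwards from future reference values, so it cannot be obtained by a purely forward-in-time (causal) update.

```python
import numpy as np

def lq_tracking_gains(A, B, Q, R, refs):
    """Backward pass for finite-horizon LQ tracking of refs[0..T].

    Assumed value function form: V_t(x) = x' P_t x - 2 s_t' x + const.
    Returns feedback gains K_t and feedforward terms g_t such that
    u_t = -K_t x_t + g_t.
    """
    T = len(refs) - 1
    P = Q.copy()              # terminal condition P_T = Q
    s = Q @ refs[T]           # terminal condition s_T = Q r_T
    gains = [None] * T
    feedforwards = [None] * T
    for t in range(T - 1, -1, -1):
        H = R + B.T @ P @ B
        K = np.linalg.solve(H, B.T @ P @ A)   # feedback gain from P_{t+1}
        g = np.linalg.solve(H, B.T @ s)       # feedforward from s_{t+1} (future references)
        gains[t], feedforwards[t] = K, g
        A_cl = A - B @ K                      # closed-loop matrix
        s = Q @ refs[t] + A_cl.T @ s          # backward (non-causal) recursion for s_t
        P = Q + A.T @ P @ A_cl                # Riccati recursion for P_t
    return gains, feedforwards

def simulate(A, B, gains, feedforwards, x0):
    """Roll the closed loop forward in time using the precomputed terms."""
    xs = [x0]
    for K, g in zip(gains, feedforwards):
        u = -K @ xs[-1] + g
        xs.append(A @ xs[-1] + B @ u)
    return np.array(xs)
```

As a sanity check under these assumptions, for a scalar integrator (A = B = Q = 1) with zero control cost (R = 0), the recursion gives K_t = 1 and g_t = r_{t+1}, so the simulated state lands exactly on the next reference value at every step.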

Original language: English (US)
Title of host publication: 2020 59th IEEE Conference on Decision and Control, CDC 2020
Publisher: Institute of Electrical and Electronics Engineers Inc.
Number of pages: 7
ISBN (Electronic): 9781728174471
State: Published - Dec 14 2020
Event: 59th IEEE Conference on Decision and Control, CDC 2020 - Virtual, Jeju Island, Korea, Republic of
Duration: Dec 14 2020 to Dec 18 2020

Publication series

Name: Proceedings of the IEEE Conference on Decision and Control
ISSN (Print): 0743-1546
ISSN (Electronic): 2576-2370


Conference: 59th IEEE Conference on Decision and Control, CDC 2020
Country/Territory: Korea, Republic of
City: Virtual, Jeju Island

ASJC Scopus subject areas

  • Control and Systems Engineering
  • Modeling and Simulation
  • Control and Optimization

