Finite-sample Guarantees for Nash Q-learning with Linear Function Approximation

Pedro Cisneros-Velarde, Sanmi Koyejo

Research output: Contribution to journal › Conference article › Peer-reviewed

Abstract

Nash Q-learning may be considered one of the first and best-known algorithms in multi-agent reinforcement learning (MARL) for learning policies that constitute a Nash equilibrium of an underlying general-sum Markov game. Its original analysis provided asymptotic guarantees and covered only the tabular case. Recently, finite-sample guarantees have been obtained for the tabular case using more modern RL techniques. Our work analyzes Nash Q-learning with linear function approximation, a representation regime introduced when the state space is large or continuous, and provides finite-sample guarantees that indicate its sample efficiency. We find that the obtained performance nearly matches an existing efficient result for single-agent RL under the same representation and exhibits a polynomial gap when compared to the best-known result for the tabular case.
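To give a flavor of the setting, the following is a minimal, hypothetical sketch of a Nash Q-learning update with linear function approximation: each agent represents its joint-action value as Q_k(s, a1, a2) ≈ φ(s, a1, a2)·w_k, and the TD target uses the Nash value of the next-state stage game built from the current estimates. For simplicity the sketch only searches for a pure-strategy Nash equilibrium of the stage game (falling back to maximin values), whereas the actual algorithm and the paper's analysis use mixed-strategy equilibria; all names (`pure_nash_value`, `nash_q_update`, `phi`) are illustrative, not from the paper.

```python
import numpy as np

def pure_nash_value(Q1, Q2):
    """Payoffs to both players at a pure-strategy Nash equilibrium of the
    stage game (Q1, Q2), if one exists; otherwise fall back to maximin
    values. Q1, Q2 are |A1| x |A2| payoff matrices (rows: player 1)."""
    n1, n2 = Q1.shape
    for i in range(n1):
        for j in range(n2):
            # (i, j) is a pure NE iff neither player can gain by deviating.
            if Q1[i, j] >= Q1[:, j].max() and Q2[i, j] >= Q2[i, :].max():
                return Q1[i, j], Q2[i, j]
    # Crude stand-in when no pure NE exists (the real algorithm uses a
    # mixed-strategy Nash equilibrium here).
    return Q1.min(axis=1).max(), Q2.min(axis=0).max()

def nash_q_update(w1, w2, phi, s, a1, a2, r1, r2, s_next,
                  actions1, actions2, alpha=0.1, gamma=0.95):
    """One Nash Q-learning step under Q_k(s,a1,a2) ~ phi(s,a1,a2) . w_k."""
    # Build the next-state stage game from both agents' linear Q-estimates.
    Q1 = np.array([[phi(s_next, b1, b2) @ w1 for b2 in actions2]
                   for b1 in actions1])
    Q2 = np.array([[phi(s_next, b1, b2) @ w2 for b2 in actions2]
                   for b1 in actions1])
    v1, v2 = pure_nash_value(Q1, Q2)
    # TD update of each agent's weight vector toward the Nash target.
    f = phi(s, a1, a2)
    w1 = w1 + alpha * (r1 + gamma * v1 - f @ w1) * f
    w2 = w2 + alpha * (r2 + gamma * v2 - f @ w2) * f
    return w1, w2

# Toy demo: one state, two actions per agent, random features of dim 4.
rng = np.random.default_rng(0)
d = 4
feats = {(0, a1, a2): rng.normal(size=d) for a1 in (0, 1) for a2 in (0, 1)}
phi = lambda s, a1, a2: feats[(s, a1, a2)]
w1, w2 = np.zeros(d), np.zeros(d)
w1, w2 = nash_q_update(w1, w2, phi, 0, 0, 1, 1.0, -1.0, 0, (0, 1), (0, 1))
```

The key point the sketch illustrates is the one the abstract refers to: the update rule is the standard linear TD recursion, except that the bootstrap target is the equilibrium value of a stage game rather than a single-agent max.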

Original language: English (US)
Pages (from-to): 424-432
Number of pages: 9
Journal: Proceedings of Machine Learning Research
Volume: 216
State: Published - 2023
Externally published: Yes
Event: 39th Conference on Uncertainty in Artificial Intelligence, UAI 2023 - Pittsburgh, United States
Duration: Jul 31 2023 - Aug 4 2023

ASJC Scopus subject areas

  • Artificial Intelligence
  • Software
  • Control and Systems Engineering
  • Statistics and Probability
