Maximum-Entropy Multi-Agent Dynamic Games: Forward and Inverse Solutions

Negar Mehr, Mingyu Wang, Maulik Bhatt, Mac Schwager

Research output: Contribution to journal › Article › peer-review

Abstract

In this article, we study the problem of multiple stochastic agents interacting in a dynamic game scenario with continuous state and action spaces. We define a new notion of stochastic Nash equilibrium for boundedly rational agents, which we call the entropic cost equilibrium (ECE). We show that ECE is a natural extension to multiple agents of maximum entropy optimality for a single agent. We solve both the "forward" and "inverse" problems for the multi-agent ECE game. For the forward problem, we provide a Riccati algorithm to compute closed-form ECE feedback policies for the agents, which are exact in the linear-quadratic-Gaussian case. We give an iterative variant to find locally ECE feedback policies for the nonlinear case. For the inverse problem, we present an algorithm to infer the cost functions of the multiple interacting agents given noisy, boundedly rational input and state trajectory examples from agents acting in an ECE. The effectiveness of our algorithms is demonstrated in a simulated multi-agent collision avoidance scenario, and with data from the INTERACTION traffic dataset. In both cases, we show that, by taking into account the agents' game-theoretic interactions using our algorithm, a more accurate model of agents' costs can be learned, compared with standard inverse optimal control methods.
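
As a rough illustration of the single-agent maximum-entropy optimality that the abstract says ECE extends, the sketch below runs a backward Riccati recursion for a scalar linear-quadratic problem. It is not the article's multi-agent algorithm. The scalar dynamics x' = a*x + b*u + noise, the stage cost 0.5*q*x^2 + 0.5*r*u^2, the unit temperature, and the names soft_lqr_scalar and sample_action are all assumptions made for this example. Under those assumptions the policy at each step is Gaussian, with mean -k*x and variance 1/(r + b^2*p).

```python
import random


def soft_lqr_scalar(a, b, q, r, q_terminal, horizon):
    """Backward Riccati recursion for a scalar maximum-entropy LQR problem.

    Assumed (hypothetical) setup: dynamics x' = a*x + b*u + noise, stage cost
    0.5*q*x**2 + 0.5*r*u**2, terminal cost 0.5*q_terminal*x**2, temperature 1.
    Returns per-step feedback gains, per-step policy variances (both ordered
    from the first time step to the last), and the initial value curvature.
    """
    p = q_terminal  # curvature of the quadratic value function at the final time
    gains, variances = [], []
    for _ in range(horizon):
        h = r + b * b * p          # curvature of the soft Q-function in u
        k = a * b * p / h          # feedback gain: policy mean is -k * x
        gains.append(k)
        variances.append(1.0 / h)  # policy variance at unit temperature
        p = q + a * a * p - (a * b * p) ** 2 / h  # Riccati update
    gains.reverse()      # recursion ran backward in time; reorder forward
    variances.reverse()
    return gains, variances, p


def sample_action(x, k, var, rng=random):
    """Draw one stochastic action from the Gaussian policy N(-k*x, var)."""
    return rng.gauss(-k * x, var ** 0.5)


if __name__ == "__main__":
    gains, variances, p0 = soft_lqr_scalar(1.0, 1.0, 1.0, 1.0, 1.0, horizon=5)
    x = 2.0
    for k, var in zip(gains, variances):
        u = sample_action(x, k, var)
        x = 1.0 * x + 1.0 * u  # noise-free rollout of the assumed dynamics
    print(gains, variances, p0)
```

The additive dynamics noise only shifts the value function by a constant, so it does not appear in the gains or variances. The gain matches the deterministic LQR gain, and only the policy variance reflects the entropy term.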

Original language: English (US)
Pages (from-to): 1801-1815
Number of pages: 15
Journal: IEEE Transactions on Robotics
Volume: 39
Issue number: 3
State: Published - Jun 2023

Keywords

  • Game-theoretic interactions
  • inverse reinforcement learning (IRL)
  • learning from demonstration
  • multi-agent systems

ASJC Scopus subject areas

  • Electrical and Electronic Engineering
  • Control and Systems Engineering
  • Computer Science Applications
