Enhanced decision framework for two-player zero-sum Markov games with diverse opponent policies

  • Jin Zhu
  • Xuan Wang
  • Dullerud E. Geir

Research output: Contribution to journal › Article › peer-review

Abstract

This paper considers a general two-player zero-sum Markov game in which our agent faces opponents of multiple types with multiple policies. To enhance our agent’s return against the opponents’ diverse policies, a novel Decision-making Framework based on Opponent Distinguishing and Policy Judgment (DF-ODPJ) is proposed. Building on pre-trained Nash equilibrium strategies, DF-ODPJ distinguishes the opponent’s type by sampling from the interaction trajectory. A fast criterion is then proposed to judge the opponent’s policy; it is proven to minimize the misjudgment probability, and the optimal threshold is calculated. Based on the identification results, appropriate policies are generated to enhance the return. DF-ODPJ is flexible because it is orthogonal to existing Nash equilibrium algorithms and single-agent reinforcement learning algorithms. Experimental results on grid world, video game, and UAV aerial combat environments illustrate the effectiveness of DF-ODPJ. The code is available at https://github.com/ChenXJ295/DF-ODPJ.

Original language: English (US)
Article number: 449
Journal: Applied Intelligence
Volume: 55
Issue number: 6
Early online date: Feb 14 2025
State: Published - Apr 2025

Keywords

  • Opponent distinguishing and policy judgment
  • Opponents with diverse policies
  • Reward enhancement
  • Two-player zero-sum Markov games

ASJC Scopus subject areas

  • Artificial Intelligence

