Boosting the actor with dual critic

Bo Dai, Albert Shaw, Niao He, Lihong Li, Le Song

Research output: Contribution to conference › Paper

Abstract

This paper proposes a new actor-critic-style algorithm called Dual Actor-Critic, or Dual-AC. It is derived in a principled way from the Lagrangian dual form of the Bellman optimality equation, which can be viewed as a two-player game between the actor and a critic-like function named the dual critic. Compared to its actor-critic relatives, Dual-AC has the desirable property that the actor and dual critic are updated cooperatively to optimize the same objective function, providing a more transparent way of learning a critic that is directly related to the actor's objective function. We then provide a concrete algorithm that can effectively solve the minimax optimization problem, using multi-step bootstrapping, path regularization, and a stochastic dual ascent algorithm. We demonstrate that the proposed algorithm achieves state-of-the-art performance across several benchmarks.
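As a rough illustration of the "Lagrangian dual form of the Bellman optimality equation" mentioned in the abstract, the sketch below uses the standard linear-programming view of a discounted MDP: minimize (1 - gamma) * E_mu[V(s)] subject to V(s) >= R(s, a) + gamma * E[V(s') | s, a], with a nonnegative multiplier rho(s, a) on each constraint. This is not the Dual-AC algorithm itself, and the paper's exact parametrization may differ. The toy MDP and the names `backup`, `occupancy`, and `lagrangian` are assumptions made up for this example.

```python
# Toy tabular MDP; every number here is invented for illustration.
GAMMA = 0.9
MU = [0.5, 0.5]                      # initial state distribution
# P[s][a][t] = probability of moving from state s to state t under action a
P = [[[0.9, 0.1], [0.2, 0.8]],
     [[0.7, 0.3], [0.1, 0.9]]]
R = [[1.0, 0.0],                     # R[s][a] = immediate reward
     [0.0, 2.0]]
N_S, N_A = 2, 2


def backup(V):
    """Q_V(s, a) = R(s, a) + gamma * E[V(s') | s, a]."""
    return [[R[s][a] + GAMMA * sum(P[s][a][t] * V[t] for t in range(N_S))
             for a in range(N_A)] for s in range(N_S)]


def value_iteration(tol=1e-10):
    """Solve the Bellman optimality equation by fixed-point iteration."""
    V = [0.0] * N_S
    while True:
        V_new = [max(q) for q in backup(V)]
        if max(abs(x - y) for x, y in zip(V, V_new)) < tol:
            return V_new
        V = V_new


def occupancy(policy, iters=2000):
    """Normalized discounted state-action occupancy of a deterministic policy."""
    d = list(MU)
    for _ in range(iters):
        # Flow equation: d = (1 - gamma) * mu + gamma * P_pi^T d
        d = [(1 - GAMMA) * MU[t]
             + GAMMA * sum(d[s] * P[s][policy[s]][t] for s in range(N_S))
             for t in range(N_S)]
    return [[d[s] if a == policy[s] else 0.0 for a in range(N_A)]
            for s in range(N_S)]


def lagrangian(V, rho):
    """L(V, rho) = (1 - gamma) E_mu[V] + sum_{s,a} rho(s,a) (Q_V(s,a) - V(s))."""
    Q = backup(V)
    primal = (1 - GAMMA) * sum(MU[s] * V[s] for s in range(N_S))
    slack = sum(rho[s][a] * (Q[s][a] - V[s])
                for s in range(N_S) for a in range(N_A))
    return primal + slack


V_star = value_iteration()
Q_star = backup(V_star)
pi_star = [max(range(N_A), key=lambda a: Q_star[s][a]) for s in range(N_S)]
rho_star = occupancy(pi_star)        # multiplier at the saddle point
saddle_value = lagrangian(V_star, rho_star)
```

In this sketch the multiplier `rho_star` plays the actor-side role (a state distribution times a policy) and `V` plays the critic-like role. Because `rho_star` satisfies the flow equation, `lagrangian(V, rho_star)` takes the same value for any `V`, which is one way to check that `(V_star, rho_star)` is a saddle point of the two-player game.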

Original language: English (US)
State: Published - Jan 1 2018
Event: 6th International Conference on Learning Representations, ICLR 2018 - Vancouver, Canada
Duration: Apr 30 2018 to May 3 2018

Conference

Conference: 6th International Conference on Learning Representations, ICLR 2018
Country: Canada
City: Vancouver
Period: 4/30/18 to 5/3/18

ASJC Scopus subject areas

  • Language and Linguistics
  • Education
  • Computer Science Applications
  • Linguistics and Language

Cite this

Dai, B., Shaw, A., He, N., Li, L., & Song, L. (2018). Boosting the actor with dual critic. Paper presented at 6th International Conference on Learning Representations, ICLR 2018, Vancouver, Canada.