Boosting the actor with dual critic

Bo Dai, Albert Shaw, Niao He, Lihong Li, Le Song

Research output: Contribution to conference › Paper

Abstract

This paper proposes a new actor-critic-style algorithm called Dual Actor-Critic, or Dual-AC. It is derived in a principled way from the Lagrangian dual form of the Bellman optimality equation, which can be viewed as a two-player game between the actor and a critic-like function called the dual critic. Compared to its actor-critic relatives, Dual-AC has the desirable property that the actor and dual critic are updated cooperatively to optimize the same objective function, which gives a more transparent way of learning a critic that is directly related to the actor's objective. We then provide a concrete algorithm that can effectively solve the minimax optimization problem, using multi-step bootstrapping, path regularization, and a stochastic dual ascent algorithm. We demonstrate that the proposed algorithm achieves state-of-the-art performance across several benchmarks.
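
For orientation, the construction the abstract refers to can be sketched as follows. This is the standard linear-programming view of the Bellman optimality equation and its Lagrangian, written in notation assumed here (initial-state distribution \mu, discount \gamma, reward R, transition kernel P, multiplier \rho). None of these symbols appear in this record, and the paper's own formulation may differ in detail.

```latex
% Assumed notation (not from this record):
%   \mu          initial-state distribution
%   \gamma       discount factor in (0,1)
%   R(s,a)       reward
%   P(s'|s,a)    transition kernel
%   \rho(s,a)    nonnegative Lagrange multipliers
\begin{align*}
\text{Primal LP:}\quad
  & \min_{V}\; (1-\gamma)\,\mathbb{E}_{s\sim\mu}\big[V(s)\big]
  \quad \text{s.t.}\quad
  V(s) \ge R(s,a) + \gamma\,\mathbb{E}_{s'\sim P(\cdot\mid s,a)}\big[V(s')\big]
  \;\; \forall (s,a), \\
\text{Lagrangian:}\quad
  & \min_{V}\,\max_{\rho \ge 0}\; L(V,\rho)
  = (1-\gamma)\,\mathbb{E}_{s\sim\mu}\big[V(s)\big]
  + \sum_{s,a}\rho(s,a)\Big(R(s,a)
  + \gamma\,\mathbb{E}_{s'\sim P(\cdot\mid s,a)}\big[V(s')\big] - V(s)\Big).
\end{align*}
```

On one plausible reading, V plays the role of the dual critic, and factoring the multiplier as \rho(s,a) = \alpha(s)\,\pi(a \mid s) exposes the policy \pi as the actor, so both players are defined through the single function L.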

Original language: English (US)
State: Published - Jan 1 2018
Event: 6th International Conference on Learning Representations, ICLR 2018 - Vancouver, Canada
Duration: Apr 30 2018 to May 3 2018

Conference

Conference: 6th International Conference on Learning Representations, ICLR 2018
Country: Canada
City: Vancouver
Period: 4/30/18 to 5/3/18

ASJC Scopus subject areas

  • Language and Linguistics
  • Education
  • Computer Science Applications
  • Linguistics and Language

Cite this

Dai, B., Shaw, A., He, N., Li, L., & Song, L. (2018). Boosting the actor with dual critic. Paper presented at 6th International Conference on Learning Representations, ICLR 2018, Vancouver, Canada.

@conference{317d916ef60a46f8ab25358c3556cf7d,
title = "Boosting the actor with dual critic",
author = "Bo Dai and Albert Shaw and Niao He and Lihong Li and Le Song",
year = "2018",
month = "1",
day = "1",
language = "English (US)",
note = "6th International Conference on Learning Representations, ICLR 2018; Conference date: 30-04-2018 through 03-05-2018",
}

TY - CONF

T1 - Boosting the actor with dual critic

AU - Dai, Bo

AU - Shaw, Albert

AU - He, Niao

AU - Li, Lihong

AU - Song, Le

PY - 2018/1/1

Y1 - 2018/1/1

UR - http://www.scopus.com/inward/record.url?scp=85062075338&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85062075338&partnerID=8YFLogxK

M3 - Paper

AN - SCOPUS:85062075338

ER -