Stochastic variance reduction for deep Q-learning

Wei Ye Zhao, Jian Peng

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Recent advances in deep reinforcement learning have achieved human-level performance on a variety of real-world applications. However, current algorithms still suffer from poor gradient estimation with excessive variance, resulting in unstable training and poor sample efficiency. In this paper, we propose an innovative optimization strategy that utilizes stochastic variance reduced gradient (SVRG) techniques. In extensive experiments on the Atari domain, our method outperforms the deep Q-learning baselines on 18 out of 20 games.
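
For readers unfamiliar with the SVRG estimator the abstract refers to, below is a minimal sketch of the generic variance-reduction idea on a toy least-squares problem, assuming nothing about the paper beyond its use of SVRG-style gradient estimates. The paper's actual integration with deep Q-learning (snapshot Q-networks, replay buffers, target updates) is not reproduced here; all names in the snippet are illustrative.

```python
# Minimal SVRG sketch (NumPy only): variance-reduced stochastic gradients
# on a toy least-squares objective. Illustrative only, not the paper's method.
import numpy as np

rng = np.random.default_rng(0)
n, d = 200, 5
X = rng.normal(size=(n, d))
y = X @ rng.normal(size=d) + 0.1 * rng.normal(size=n)

def grad_i(w, i):
    """Gradient of the i-th squared-error term at w."""
    return (X[i] @ w - y[i]) * X[i]

def full_grad(w):
    """Full-batch gradient, recomputed once per outer epoch."""
    return X.T @ (X @ w - y) / n

w = np.zeros(d)
lr, epochs, inner_steps = 0.01, 20, n
for _ in range(epochs):
    w_snap = w.copy()        # snapshot ("anchor") iterate
    mu = full_grad(w_snap)   # full gradient at the snapshot
    for _ in range(inner_steps):
        i = rng.integers(n)
        # SVRG estimator: stochastic gradient corrected by the snapshot term.
        # Unbiased, with variance shrinking as w approaches w_snap.
        g = grad_i(w, i) - grad_i(w_snap, i) + mu
        w -= lr * g

print("final loss:", 0.5 * np.mean((X @ w - y) ** 2))
```

The key design point is the corrected estimator `g`: its expectation equals the true gradient, but the control-variate term cancels most of the per-sample noise, which is the stability benefit the abstract claims for training Q-networks.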

Original language: English (US)
Title of host publication: 18th International Conference on Autonomous Agents and Multiagent Systems, AAMAS 2019
Publisher: International Foundation for Autonomous Agents and Multiagent Systems (IFAAMAS)
Pages: 2318-2320
Number of pages: 3
ISBN (Electronic): 9781510892002
State: Published - 2019
Event: 18th International Conference on Autonomous Agents and Multiagent Systems, AAMAS 2019 - Montreal, Canada
Duration: May 13, 2019 - May 17, 2019

Publication series

Name: Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems, AAMAS
Volume: 4
ISSN (Print): 1548-8403
ISSN (Electronic): 1558-2914

Conference

Conference: 18th International Conference on Autonomous Agents and Multiagent Systems, AAMAS 2019
Country/Territory: Canada
City: Montreal
Period: 5/13/19 - 5/17/19

Keywords

  • Deep Q-learning
  • Gradient variance
  • Stochastic variance reduction

ASJC Scopus subject areas

  • Artificial Intelligence
  • Software
  • Control and Systems Engineering
