Feature Selection for Neuro-Dynamic Programming

Dayu Huang, W. Chen, P. Mehta, S. Meyn, A. Surana

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Abstract

Neuro-dynamic programming encompasses techniques from both reinforcement learning and approximate dynamic programming. Feature selection refers to the choice of basis defining the function class used in applying these techniques. This chapter reviews two popular approaches to neuro-dynamic programming, TD- and Q-learning. The main goal of the chapter is to demonstrate how insight from idealized models can guide feature selection for these algorithms. Several approaches are surveyed, including fluid and diffusion models, as well as idealized models arising from mean-field game approximations. The theory is illustrated with several examples.
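The abstract's notion of a basis can be made concrete with a small sketch. The following is a hedged illustration (not taken from the chapter) of least-squares TD value estimation on a toy two-state Markov chain: the feature map `psi`, here a simple indicator basis, stands in for the feature-selection choice the chapter discusses. The chain, costs, and all names are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy two-state Markov chain (assumed for illustration): transition matrix P,
# per-state cost c, and discount factor gamma.
P = np.array([[0.9, 0.1],
              [0.2, 0.8]])
c = np.array([1.0, 0.0])
gamma = 0.95

def psi(x):
    """Feature (basis) vector for state x; an indicator basis is assumed here.
    Feature selection amounts to choosing this map."""
    v = np.zeros(2)
    v[x] = 1.0
    return v

# Accumulate least-squares TD statistics along one simulated trajectory:
# A ~ sum psi(x_n) (psi(x_n) - gamma psi(x_{n+1}))^T,  b ~ sum psi(x_n) c(x_n).
A = np.zeros((2, 2))
b = np.zeros(2)
x = 0
for _ in range(100_000):
    x_next = rng.choice(2, p=P[x])
    f = psi(x)
    A += np.outer(f, f - gamma * psi(x_next))
    b += f * c[x]
    x = x_next

# Weight vector theta gives the approximation V(x) ~ theta @ psi(x).
theta = np.linalg.solve(A, b)

# Exact discounted value function for comparison: V = (I - gamma P)^{-1} c.
V_exact = np.linalg.solve(np.eye(2) - gamma * P, c)
```

With an indicator basis the approximation is exact in the limit, so `theta` should be close to `V_exact`; with a coarser basis (fewer features than states), the same procedure yields the best fit within the chosen function class, which is where model-based feature selection matters.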

Original language: English (US)
Title of host publication: Reinforcement Learning and Approximate Dynamic Programming for Feedback Control
Publisher: John Wiley & Sons, Ltd.
Pages: 535-559
Number of pages: 25
ISBN (Print): 9781118104200
DOIs
State: Published - Feb 7 2013

Keywords

  • Feature selection for neuro DP
  • Neuro DP, RL and DP
  • Neuro-dynamic, TD-/Q-Learning for MDPs
  • Optimal for stochastic/deterministic, SARSA
  • Parameterized RL, via LP approaches

ASJC Scopus subject areas

  • General Engineering
