A simulation-based method for aggregating Markov chains

Kun Deng, Prashant G. Mehta, Sean P. Meyn

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

This paper addresses model reduction for a Markov chain on a large state space. A simulation-based framework is introduced to perform state aggregation of the Markov chain based on observations of a single sample path. The Kullback-Leibler (K-L) divergence rate is employed as a metric to measure the distance between two stationary Markov chains. Model reduction with respect to this metric is cast as an infinite-horizon average cost optimal control problem. In this way an optimal policy corresponds to an optimal partition of the state space with respect to the K-L divergence rate. The optimal control problem is simplified in an approximate dynamic programming (ADP) framework: First, a relaxation of the policy space is performed, and based on this a parameterization of the set of optimal policies is introduced. This makes possible a stochastic approximation approach to compute the best policy within a given parameterized class. The algorithm can be implemented using a single sample path of the Markov chain. Convergence is established using the ODE method. Examples illustrate the theoretical results, and show remarkably low variance and fast convergence.

Original languageEnglish (US)
Title of host publicationProceedings of the 48th IEEE Conference on Decision and Control held jointly with 2009 28th Chinese Control Conference, CDC/CCC 2009
PublisherInstitute of Electrical and Electronics Engineers Inc.
Pages4710-4716
Number of pages7
ISBN (Print)9781424438716
DOIs
StatePublished - 2009
Event48th IEEE Conference on Decision and Control held jointly with 2009 28th Chinese Control Conference, CDC/CCC 2009 - Shanghai, China
Duration: Dec 15 2009Dec 18 2009

Publication series

NameProceedings of the IEEE Conference on Decision and Control
ISSN (Print)0743-1546
ISSN (Electronic)2576-2370

Other

Other48th IEEE Conference on Decision and Control held jointly with 2009 28th Chinese Control Conference, CDC/CCC 2009
Country/TerritoryChina
CityShanghai
Period12/15/0912/18/09

ASJC Scopus subject areas

  • Control and Systems Engineering
  • Modeling and Simulation
  • Control and Optimization

Fingerprint

Dive into the research topics of 'A simulation-based method for aggregating Markov chains'. Together they form a unique fingerprint.

Cite this