### Abstract

This paper addresses model reduction for a Markov chain on a large state space. A simulation-based framework is introduced to perform state aggregation of the Markov chain based on observations of a single sample path. The Kullback-Leibler (K-L) divergence rate is employed as a metric to measure the distance between two stationary Markov chains. Model reduction with respect to this metric is cast as an infinite-horizon average cost optimal control problem. In this way an optimal policy corresponds to an optimal partition of the state space with respect to the K-L divergence rate. The optimal control problem is simplified in an approximate dynamic programming (ADP) framework: First, a relaxation of the policy space is performed, and based on this a parameterization of the set of optimal policies is introduced. This makes possible a stochastic approximation approach to compute the best policy within a given parameterized class. The algorithm can be implemented using a single sample path of the Markov chain. Convergence is established using the ODE method. Examples illustrate the theoretical results, and show remarkably low variance and fast convergence.

Original language | English (US) |
---|---|

Title of host publication | Proceedings of the 48th IEEE Conference on Decision and Control held jointly with 2009 28th Chinese Control Conference, CDC/CCC 2009 |

Pages | 4710-4716 |

Number of pages | 7 |

DOIs | |

State | Published - Dec 1 2009 |

Event | 48th IEEE Conference on Decision and Control held jointly with 2009 28th Chinese Control Conference, CDC/CCC 2009 - Shanghai, China Duration: Dec 15 2009 → Dec 18 2009 |

### Publication series

Name | Proceedings of the IEEE Conference on Decision and Control |
---|---|

ISSN (Print) | 0191-2216 |

### Other

Other | 48th IEEE Conference on Decision and Control held jointly with 2009 28th Chinese Control Conference, CDC/CCC 2009 |
---|---|

Country | China |

City | Shanghai |

Period | 12/15/09 → 12/18/09 |

### ASJC Scopus subject areas

- Control and Systems Engineering
- Modeling and Simulation
- Control and Optimization

## Fingerprint Dive into the research topics of 'A simulation-based method for aggregating Markov chains'. Together they form a unique fingerprint.

## Cite this

*Proceedings of the 48th IEEE Conference on Decision and Control held jointly with 2009 28th Chinese Control Conference, CDC/CCC 2009*(pp. 4710-4716). [5400533] (Proceedings of the IEEE Conference on Decision and Control). https://doi.org/10.1109/CDC.2009.5400533