On the choice of number of superstates in the aggregation of Markov chains

Amber Srivastava, Raj K. Velicheti, Srinivasa M. Salapaka

Research output: Contribution to journalArticlepeer-review

Abstract

Many studies involving large Markov chains require determining a smaller representative (aggregated) chain. Each superstate in the representative chain represents a group of related states in the original Markov chain. Typically, the choice of number of superstates in the aggregated chain is ambiguous, and based on the limited prior know-how. This paper presents a structured methodology of determining the best candidate for the number of superstates. It achieves this by comparing aggregated chains of different sizes. To facilitate this comparison a new quantity called heterogeneity of a superstate is developed, and subsequently it is used to establish the notion of marginal return of an aggregated chain. In particular, the notion of marginal return captures the decrease in the heterogeneity upon a unit increase in the number of superstates in the aggregated chain. Maximum Entropy Principle (MEP), from statistical mechanics, justifies the developed notion of marginal return, as well as the quantification of heterogeneity. Through simulations on synthetic Markov chains, where the number of superstates are known a priori, it is observed that the aggregated chain with the largest marginal return identifies this number. In case of Markov chains that model real-life scenarios it is shown that the aggregated model with the largest marginal return identifies an inherent structure unique to the scenario being modelled; thus, substantiating the efficacy of the proposed methodology.

Original languageEnglish (US)
Pages (from-to)181-188
Number of pages8
JournalPattern Recognition Letters
Volume159
DOIs
StatePublished - Jul 2022

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Computer Vision and Pattern Recognition
  • Artificial Intelligence

Fingerprint

Dive into the research topics of 'On the choice of number of superstates in the aggregation of Markov chains'. Together they form a unique fingerprint.

Cite this