Model-based longitudinal clustering with varying cluster assignments

Research output: Contribution to journal › Article › peer-review


It is often of interest to perform clustering on longitudinal data, yet it is difficult to formulate an intuitive model for which estimation is computationally feasible. We propose a model-based clustering method for clustering objects that are observed over time. The proposed model can be viewed as an extension of the normal mixture model for clustering to longitudinal data. While existing models only account for clustering effects, we propose modeling the distribution of the observed values of each object as a blending of a cluster effect and an individual effect, hence also giving an estimate of how much the behavior of an object is determined by the cluster to which it belongs. Further, it is important to detect how explanatory variables affect the clustering. An advantage of our method is that it can handle multiple explanatory variables of any type through a linear modeling of the cluster transition probabilities. We implement the generalized EM algorithm using several recursive relationships to greatly decrease the computational cost. The accuracy of our estimation method is illustrated in a simulation study, and U.S. Congressional data is analyzed.
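The two ingredients named in the abstract, observations that blend a cluster effect with an individual effect and cluster transition probabilities that depend linearly on explanatory variables, can be illustrated with a small simulation. The sketch below is not the paper's model or its generalized EM estimator. It assumes a convex blend with a single weight `lam`, a softmax (multinomial logistic) link for the transitions, and isotropic normal noise. All names (`transition_probs`, `simulate`, `mu`, `beta`, `lam`, `sigma`) are hypothetical.

```python
import numpy as np

def transition_probs(x, beta):
    """Multinomial-logistic transition matrix for one covariate vector.

    x    : (p,) explanatory variables (assumed numeric here)
    beta : (K, K, p) coefficients; beta[j, k] scores a move from cluster j to k
    Returns a (K, K) matrix whose rows sum to 1.
    """
    logits = beta @ x                             # (K, K) linear predictors
    logits -= logits.max(axis=1, keepdims=True)   # stabilise the softmax
    w = np.exp(logits)
    return w / w.sum(axis=1, keepdims=True)

def simulate(n, T, mu, beta, lam, sigma, rng):
    """Simulate n objects over T time points (illustrative assumptions only).

    mu    : (K, d) cluster means
    lam   : weight in [0, 1] on the cluster effect; 1 - lam goes to the
            object's own individual effect (assumed form of the blending)
    sigma : standard deviation of the isotropic normal noise
    """
    K, d = mu.shape
    p = beta.shape[2]
    x = rng.normal(size=(n, T, p))     # synthetic covariates
    indiv = rng.normal(size=(n, d))    # one individual effect per object
    z = np.zeros((n, T), dtype=int)    # cluster assignments, may change over time
    y = np.zeros((n, T, d))            # observed values
    for i in range(n):
        z[i, 0] = rng.integers(K)      # uniform initial assignment (assumption)
        for t in range(T):
            if t > 0:
                P = transition_probs(x[i, t], beta)
                z[i, t] = rng.choice(K, p=P[z[i, t - 1]])
            mean = lam * mu[z[i, t]] + (1.0 - lam) * indiv[i]
            y[i, t] = mean + sigma * rng.normal(size=d)
    return x, z, y
```

Under these assumptions, `lam` close to 1 means an object's behavior is driven almost entirely by its current cluster, while `lam` close to 0 means it follows its own individual effect regardless of cluster membership.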

Original language: English (US)
Pages (from-to): 205-233
Number of pages: 29
Journal: Statistica Sinica
Issue number: 1
State: Published - Jan 2016


Keywords

  • Cluster analysis
  • EM algorithm
  • Multinomial logistic regression
  • Normal mixture models
  • Time series

ASJC Scopus subject areas

  • Statistics and Probability
  • Statistics, Probability and Uncertainty

