Estimating and Identifying Unspecified Correlation Structure for Longitudinal Data

Jianhua Hu, Peng Wang, Annie Qu

Research output: Contribution to journal › Article › peer-review


Identifying correlation structure is important to achieving estimation efficiency in analyzing longitudinal data, and is also crucial for drawing valid statistical inference for large-size clustered data. In this article, we propose a nonparametric method to estimate the correlation structure, which is applicable for discrete longitudinal data. We use eigenvector-based basis matrices to approximate the inverse of the empirical correlation matrix and determine the number of basis matrices via model selection. A penalized objective function based on the difference between the empirical and model approximation of the correlation matrices is adopted to select an informative structure for the correlation matrix. The eigenvector representation of the correlation estimation is capable of reducing the risk of model misspecification, and also provides useful information on the specific within-cluster correlation pattern of the data. We show that the proposed method possesses the oracle property and selects the true correlation structure consistently. The proposed method is illustrated through simulations and two data examples on air pollution and sonar signal studies.
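The eigenvector representation mentioned in the abstract can be illustrated with a small numerical sketch. The code below is not the authors' implementation: it only shows that the inverse of a correlation matrix can be written as a linear combination of rank-one basis matrices built from its eigenvectors. The function names (`eigen_basis`, `inverse_from_bases`) and the AR(1)-type toy matrix are assumptions made for illustration, and the penalized selection step (e.g. SCAD) is not implemented; the `keep` argument merely stands in for whichever basis matrices a selection procedure would retain.

```python
import numpy as np

def eigen_basis(R):
    """Eigenvalues (descending) and rank-one basis matrices v_k v_k^T of symmetric R."""
    eigvals, eigvecs = np.linalg.eigh(R)      # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]         # reorder to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    bases = [np.outer(eigvecs[:, k], eigvecs[:, k]) for k in range(R.shape[0])]
    return eigvals, bases

def inverse_from_bases(eigvals, bases, keep):
    """Linear combination sum_k (1 / lambda_k) * M_k over the indices in `keep`."""
    return sum(bases[k] / eigvals[k] for k in keep)

# Hypothetical toy input: AR(1)-type correlation, rho = 0.5, 4 repeated measurements.
rho, m = 0.5, 4
idx = np.arange(m)
R = rho ** np.abs(idx[:, None] - idx[None, :])

eigvals, bases = eigen_basis(R)

# Using every basis matrix reproduces the exact inverse.
full = inverse_from_bases(eigvals, bases, range(m))
print(np.allclose(full, np.linalg.inv(R)))

# Using a subset gives an approximation; report its Frobenius-norm error.
partial = inverse_from_bases(eigvals, bases, [m - 2, m - 1])
print(np.linalg.norm(partial - np.linalg.inv(R)))
```

In this sketch the basis matrices with the smallest eigenvalues carry the largest coefficients in the inverse, which is why the subset above keeps the last two indices. How many basis matrices to keep is the model selection question the abstract addresses through a penalized objective function.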

Original language: English (US)
Pages (from-to): 455-476
Number of pages: 22
Journal: Journal of Computational and Graphical Statistics
Issue number: 2
State: Published - Apr 3 2015


Keywords

  • Correlated data
  • Eigenvector decomposition
  • Oracle property
  • Quadratic inference function
  • SCAD penalty

ASJC Scopus subject areas

  • Statistics and Probability
  • Statistics, Probability and Uncertainty
  • Discrete Mathematics and Combinatorics

