Consistent model selection and data-driven smooth tests for longitudinal data in the estimating equations approach

Lan Wang, Annie Qu

Research output: Contribution to journal › Article › peer-review


Model selection for marginal regression analysis of longitudinal data is challenging owing to the presence of correlation and the difficulty of specifying the full likelihood, particularly for correlated categorical data. The paper introduces a novel Bayesian information criterion type model selection procedure based on the quadratic inference function, which does not require the full likelihood or quasi-likelihood. With probability approaching 1, the criterion selects the most parsimonious correct model. Although a working correlation matrix is assumed, there is no need to estimate the nuisance parameters in the working correlation matrix; moreover, the model selection procedure is robust against the misspecification of the working correlation matrix. The criterion proposed can also be used to construct a data-driven Neyman smooth test for checking the goodness of fit of a postulated model. This test is especially useful and often yields much higher power in situations where the classical directional test behaves poorly. The finite sample performance of the model selection and model checking procedures is demonstrated through Monte Carlo studies and analysis of a clinical trial data set.
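To make the idea of a BIC-type criterion built on the quadratic inference function (QIF) concrete, here is a minimal sketch, not the authors' implementation. The abstract does not state the formula, so everything below is an assumption made for illustration: the criterion is taken to have the usual BIC-type form Q(beta_hat) + p log N, with N the number of subjects and p the number of regression parameters; the marginal mean is linear with identity link and no intercept; the working correlation is represented by the two basis matrices I and 11' - I; and the estimate is found by a simple Gauss-Newton iteration. The function names `qif_fit`, `biqif`, and `simulate`, and the simulated design (a response that depends on the current and the lagged value of a covariate), are hypothetical.

```python
import numpy as np


def qif_fit(X, y, max_iter=50, tol=1e-10):
    """Sketch of a QIF fit for a linear marginal mean (identity link).

    X has shape (N, T, p) and y has shape (N, T): N subjects, T visits.
    Assumed basis matrices for the working correlation: I and 11' - I.
    Returns the estimate beta_hat and the QIF value Q(beta_hat).
    """
    N, T, p = X.shape
    bases = [np.eye(T), np.ones((T, T)) - np.eye(T)]
    # Derivative of the averaged extended score with respect to beta;
    # it is constant in beta because the mean is linear.
    D = -np.concatenate(
        [np.einsum("nsp,st,ntq->pq", X, M, X) for M in bases], axis=0
    ) / N

    def scores(beta):
        r = y - X @ beta  # residuals, shape (N, T)
        # Per-subject extended score: stack X_i' M_k r_i over the bases.
        return np.concatenate(
            [np.einsum("nsp,st,nt->np", X, M, r) for M in bases], axis=1
        )

    def pieces(beta):
        g = scores(beta)
        return g.mean(axis=0), g.T @ g / N  # gbar and its sample second moment

    # Start from the independence (pooled least squares) estimate.
    beta = np.linalg.lstsq(X.reshape(N * T, p), y.reshape(N * T), rcond=None)[0]
    for _ in range(max_iter):
        gbar, C = pieces(beta)
        CinvD = np.linalg.solve(C, D)
        step = np.linalg.solve(D.T @ CinvD, CinvD.T @ gbar)
        beta = beta - step
        if np.max(np.abs(step)) < tol:
            break
    gbar, C = pieces(beta)
    q = N * gbar @ np.linalg.solve(C, gbar)
    return beta, float(q)


def biqif(X, y):
    """Assumed BIC-type criterion: Q(beta_hat) + p * log(N)."""
    N, _, p = X.shape
    _, q = qif_fit(X, y)
    return q + p * np.log(N)


def simulate(N=800, T=4, seed=0):
    """Hypothetical data: y depends on the current and lagged covariate."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=(N, T + 1))
    cur, lag = x[:, 1:], x[:, :-1]
    noise_cov = rng.normal(size=(N, T))  # covariate with no effect on y
    # Exchangeable errors: variance 1, within-subject correlation 0.5.
    eps = np.sqrt(0.5) * (rng.normal(size=(N, 1)) + rng.normal(size=(N, T)))
    y = 1.0 * cur + 1.0 * lag + eps
    return np.stack([cur, lag, noise_cov], axis=2), y


if __name__ == "__main__":
    X, y = simulate()
    candidates = {"current": [0], "current+lag": [0, 1], "all three": [0, 1, 2]}
    for name, cols in candidates.items():
        print(name, round(biqif(X[:, :, cols], y), 2))
```

Under these assumptions, no nuisance correlation parameter is estimated at any point: the basis matrices enter only through the stacked moment conditions. A candidate that omits the lagged covariate violates the cross-time moment conditions, so its QIF value grows with N, while each superfluous covariate costs log N.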

Original language: English (US)
Pages (from-to): 177-190
Number of pages: 14
Journal: Journal of the Royal Statistical Society. Series B: Statistical Methodology
Issue number: 1
State: Published - Jan 2009

Keywords


  • Bayes information criterion
  • Correlated data
  • Generalized estimating equations
  • Longitudinal data
  • Marginal model
  • Model checking
  • Model selection
  • Neyman smooth test
  • Quadratic inference function

ASJC Scopus subject areas

  • Statistics and Probability
  • Statistics, Probability and Uncertainty

