Categorical Data

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Abstract

Categorical data refers to counts of events or individuals observed through some defined process and often allocated to subgroups, or categories, corresponding to levels of one or more attributes. This article reviews methods for interpreting collections of such counts when they arise from apparently random environmental processes and may be treated as dependent variables relative to potentially explanatory factors or covariates. After introducing basic terminology, including measures of relative frequency and association, we review the Poisson probability distribution. This is followed by the binomial, multinomial, and hypergeometric distributions, and products thereof, which result from conditioning upon sums of independent Poisson counts. These form the basis for modeling the random variation in observed categorical data. For modeling structural relationships, generalized linear models are first defined, and Poisson regression, logistic regression, and log-linear models are each considered within that framework. We then summarize several methods for analyzing the correlated counts that occur when observing a categorical dependent variable on the same observational units under several measurement conditions or at multiple observation times, or on multiple observational units within matched sets. These methods include weighted least-squares functional regression, conditional logistic regression, Cochran–Mantel–Haenszel tests, generalized linear mixed models, and analyses using generalized estimating equations. Finally, we briefly comment on Bayes and empirical Bayes methods, spatial modeling, and exact methods, and add inevitably ephemeral comments on the status of software at the turn of the millennium.
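
As an illustration of the conditioning argument mentioned in the abstract, the following minimal sketch checks numerically that, for two independent Poisson counts, the distribution of the first count given their sum is binomial with success probability equal to the first mean divided by the total mean. The function names and the rate values (2.0 and 3.0) are assumptions chosen for illustration and are not taken from the chapter.

```python
from math import comb, exp, factorial


def poisson_pmf(k, lam):
    """Probability that a Poisson(lam) count equals k."""
    return exp(-lam) * lam**k / factorial(k)


def binomial_pmf(k, n, p):
    """Probability of k successes in n Bernoulli(p) trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def conditional_pmf(k, n, lam1, lam2):
    """P(X1 = k | X1 + X2 = n) for independent Poisson counts X1 and X2."""
    joint = poisson_pmf(k, lam1) * poisson_pmf(n - k, lam2)
    # The sum of independent Poisson counts is Poisson with the summed mean.
    total = poisson_pmf(n, lam1 + lam2)
    return joint / total


# Illustrative (hypothetical) rates and conditioning total.
lam1, lam2, n = 2.0, 3.0, 6
p = lam1 / (lam1 + lam2)

# Largest absolute difference between the conditional and binomial probabilities.
max_gap = max(
    abs(conditional_pmf(k, n, lam1, lam2) - binomial_pmf(k, n, p))
    for k in range(n + 1)
)
print(f"largest discrepancy over k = 0..{n}: {max_gap:.2e}")
```

The discrepancy should be at the level of floating-point rounding. The same argument with more than two independent Poisson counts gives a multinomial distribution.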

Original language: English (US)
Title of host publication: Encyclopedia of Environmetrics
Publisher: Wiley
Pages: 1-23
Number of pages: 23
ISBN (Electronic): 9780470057339
ISBN (Print): 9780471899976
State: Published - Jan 1 2006

Keywords

  • categorical data
  • Cochran–Mantel–Haenszel tests
  • generalized estimating equations
  • generalized linear mixed models
  • generalized linear models
  • log-linear models
  • logistic regression
  • Poisson regression
  • random counts
  • spatial analysis of rates
  • weighted least-squares models

ASJC Scopus subject areas

  • General Mathematics
