Abstract
Categorical data refers to counts of events or individuals observed through some defined process and often allocated to subgroups, or categories, corresponding to levels of one or more attributes. This article reviews methods for interpreting collections of such counts when they arise from apparently random environmental processes and may be treated as dependent variables relative to potentially explanatory factors or covariates. After introducing basic terminology, including measures of relative frequency and association, we review the Poisson probability distribution. This is followed by the binomial, multinomial, and hypergeometric distributions, and products thereof, which result from conditioning on sums of independent Poisson counts. These form the basis for modeling the random variation in observed categorical data. For modeling structural relationships, generalized linear models are first defined, and Poisson regression, logistic regression, and log-linear models are each considered within that framework. We then summarize several methods for analyzing the correlated counts that occur when observing a categorical dependent variable on the same observational units under several measurement conditions or at multiple observation times, or on multiple observational units within matched sets. These methods include weighted least-squares functional regression, conditional logistic regression, Cochran–Mantel–Haenszel tests, generalized linear mixed models, and analyses using generalized estimating equations. Finally, we briefly comment on Bayes and empirical Bayes methods, spatial modeling, and exact methods, and add inevitably ephemeral comments on the status of software at the turn of the millennium.
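As a brief illustration of the conditioning step mentioned in the abstract, the following sketch shows how a multinomial distribution arises from independent Poisson counts. The notation ($Y_i$, $\mu_i$, $\mu_+$, $\pi_i$, $k$, $n$) is assumed here for illustration and is not taken from the article itself.

```latex
% Assumed setup: Y_1, ..., Y_k independent, Y_i ~ Poisson(mu_i),
% with total N = Y_1 + ... + Y_k ~ Poisson(mu_+), mu_+ = mu_1 + ... + mu_k.
\begin{align*}
P(Y_1 = y_1, \dots, Y_k = y_k \mid N = n)
  &= \frac{\prod_{i=1}^{k} e^{-\mu_i}\, \mu_i^{y_i} / y_i!}
          {e^{-\mu_+}\, \mu_+^{\,n} / n!}
  \qquad \Bigl(\textstyle\sum_{i=1}^{k} y_i = n\Bigr) \\
  &= \frac{n!}{\prod_{i=1}^{k} y_i!} \prod_{i=1}^{k} \pi_i^{\,y_i},
  \qquad \pi_i = \frac{\mu_i}{\mu_+}.
\end{align*}
% The exponential factors cancel because sum_i mu_i = mu_+,
% and mu_+^n = prod_i mu_+^{y_i} because sum_i y_i = n.
```

With $k = 2$ this reduces to the binomial distribution with success probability $\pi_1 = \mu_1 / (\mu_1 + \mu_2)$.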
| Original language | English (US) |
|---|---|
| Title of host publication | Encyclopedia of Environmetrics |
| Publisher | Wiley |
| Pages | 1-23 |
| Number of pages | 23 |
| ISBN (Electronic) | 9780470057339 |
| ISBN (Print) | 9780471899976 |
| State | Published - Jan 1 2006 |
Keywords
- categorical data
- Cochran–Mantel–Haenszel tests
- generalized estimating equations
- generalized linear mixed models
- generalized linear models
- log-linear models
- logistic regression
- Poisson regression
- random counts
- spatial analysis of rates
- weighted least-squares models
ASJC Scopus subject areas
- General Mathematics