Abstract
Purpose: This study examined potential sources of selection and information biases when using residence history information from a commercial database to construct residential histories for cancer research. Methods: We searched the LexisNexis database for residence data on 3473 adults diagnosed with cancers of the prostate, colon/rectum, and female breast in a single health-care system between 2005 and 2016, using the name and address at diagnosis and the birth date. Residential histories were generated from the results using open-source statistical programs from the National Cancer Institute. Multivariable regression models analyzed the associations of the search results with demographic characteristics and all-cause mortality. Results: Racial/ethnic minorities were less likely to match to vendor residence data compared with non-Hispanic whites (odds ratios [95% confidence intervals (CIs)] for non-Hispanic blacks, Hispanics, and Asian/Pacific Islanders were 1.66 [1.30, 2.12], 2.92 [2.18, 3.90], and 4.53 [2.72, 7.55], respectively). Being non-Hispanic black was negatively associated with years of residential history (vs. non-Hispanic whites, β coefficient [95% CI] = −2.57 [−3.40, −1.73]). Not matching to residence data was associated with increased 5-year odds of death from any cause (vs. matched subjects, odds ratio [95% CI] = 5.92 [4.29, 8.50]). Conclusions: Differential ascertainment of residence history by race/ethnicity and the association of ascertainment with prognosis are potential sources of selection and information biases when using residence data from a commercial database.
Original language | English (US) |
---|---|
Pages (from-to) | 35-40.e1 |
Journal | Annals of Epidemiology |
Volume | 51 |
State | Published - Nov 2020 |
Externally published | Yes |
Keywords
- Cancer
- Database
- Demographic factors
- Information bias
- Residence characteristics
- Sampling bias
ASJC Scopus subject areas
- Epidemiology