Score test variable screening

Research output: Contribution to journal › Article

Abstract

Variable screening has emerged as a crucial first step in the analysis of high-throughput data, but existing procedures can be computationally cumbersome, difficult to justify theoretically, or inapplicable to certain types of analyses. Motivated by a high-dimensional censored quantile regression problem in multiple myeloma genomics, this article makes three contributions. First, we establish a score test-based screening framework, which is widely applicable, extremely computationally efficient, and relatively simple to justify. Second, we propose a resampling-based procedure for selecting the number of variables to retain after screening according to the principle of reproducibility. Finally, we propose a new iterative score test screening method which is closely related to sparse regression. In simulations, we apply our methods to four different regression models and show that they can outperform existing procedures. We also apply score test screening to an analysis of gene expression data from multiple myeloma patients using a censored quantile regression model to identify high-risk genes.
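
As a rough illustration of the general idea of screening by marginal score statistics, the sketch below ranks covariates in an ordinary linear model and keeps the top few. It is not the article's procedure: the linear model, the intercept-only null fit, the standardization, and the names `score_screen`, `X`, `y`, and `d` are all assumptions made for this example, and the censored quantile regression setting of the abstract is not covered.

```python
import numpy as np

def score_screen(X, y, d):
    """Keep the d covariates with the largest marginal score statistics.

    Assumptions (illustrative only): a linear model whose null fit is
    intercept-only, and no constant columns in X.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    resid = y - y.mean()              # residuals under the intercept-only null model
    Xc = X - X.mean(axis=0)           # center each covariate
    scores = Xc.T @ resid             # score for each coefficient, evaluated at zero
    # Approximate standard deviation of each score under the null
    scale = np.sqrt((Xc ** 2).sum(axis=0) * resid.var())
    stats = np.abs(scores) / scale    # standardized marginal score statistics
    keep = np.argsort(stats)[::-1][:d]  # indices of the d largest statistics
    return keep, stats
```

Only one null-model fit is needed here, since every covariate's statistic reuses the same residuals. How many variables `d` to retain is left to the caller in this sketch.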

Original language: English (US)
Pages (from-to): 862-871
Number of pages: 10
Journal: Biometrics
Volume: 70
Issue number: 4
State: Published - Dec 1 2014

Keywords

  • Feature selection
  • High-dimensional data
  • Projected subgradient method
  • Score test
  • Variable screening

ASJC Scopus subject areas

  • Statistics and Probability
  • Biochemistry, Genetics and Molecular Biology (all)
  • Immunology and Microbiology (all)
  • Agricultural and Biological Sciences (all)
  • Applied Mathematics
