Signal classification for the integrative analysis of multiple sequences of large-scale multiple tests

Dongdong Xiang, Sihai Dave Zhao, T. Tony Cai

Research output: Contribution to journal › Article › peer-review

Abstract

The integrative analysis of multiple data sets is becoming increasingly important in many fields of research. When the same features are studied in several independent experiments, it can often be useful to analyse jointly the multiple sequences of multiple tests that result. It is frequently necessary to classify each feature into one of several categories, depending on the null and non-null configuration of its corresponding test statistics. The paper studies this signal classification problem, motivated by a range of applications in large-scale genomics. Two new types of misclassification rate are introduced, and two oracle procedures are developed to control each type while also achieving the largest expected number of correct classifications. Corresponding data-driven procedures are also proposed, proved to be asymptotically valid and optimal under certain conditions and shown in numerical experiments to be nearly as powerful as the oracle procedures. In an application to psychiatric genetics, the procedures proposed are used to discover genetic variants that may affect both bipolar disorder and schizophrenia, as well as variants that may help to distinguish between these conditions.

Original language: English (US)
Pages (from-to): 707-734
Number of pages: 28
Journal: Journal of the Royal Statistical Society. Series B: Statistical Methodology
Volume: 81
Issue number: 4
State: Published - 2019

Keywords

  • Integrative analysis
  • Multiple testing
  • Set-specific marginal false discovery rate
  • Signal classification
  • Total marginal false discovery rate

ASJC Scopus subject areas

  • Statistics and Probability
  • Statistics, Probability and Uncertainty
