TY - JOUR
T1 - Estimation and inference for the indirect effect in high-dimensional linear mediation models
AU - Zhou, Ruixuan Rachel
AU - Wang, Liewei
AU - Zhao, Sihai Dave
N1 - Funding Information:
The authors thank the reviewers and the associate editor for their extremely useful comments, as well as Casey Hanson for his help with data processing. This work was funded in part by the Mayo Clinic-UIUC Alliance and by the trans-NIH Big Data to Knowledge initiative. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. Zhao was funded in part by the National Science Foundation and is also affiliated with the Carl R. Woese Institute for Genomic Biology at the University of Illinois.
Publisher Copyright:
© 2020 Biometrika Trust.
PY - 2020/9/1
Y1 - 2020/9/1
AB - Mediation analysis is difficult when the number of potential mediators is larger than the sample size. In this paper we propose new inference procedures for the indirect effect in the presence of high-dimensional mediators for linear mediation models. We develop methods for both incomplete mediation, where a direct effect may exist, and complete mediation, where the direct effect is known to be absent. We prove consistency and asymptotic normality of our indirect effect estimators. Under complete mediation, where the indirect effect is equivalent to the total effect, we further prove that our approach gives a more powerful test compared to directly testing for the total effect. We confirm our theoretical results in simulations, as well as in an integrative analysis of gene expression and genotype data from a pharmacogenomic study of drug response. We present a novel analysis of gene sets to understand the molecular mechanisms of drug response, and also identify a genome-wide significant noncoding genetic variant that cannot be detected using standard analysis methods.
KW - High-dimensional inference
KW - Integrative genomics
KW - Mediation analysis
UR - http://www.scopus.com/inward/record.url?scp=85091315643&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85091315643&partnerID=8YFLogxK
U2 - 10.1093/biomet/asaa016
DO - 10.1093/biomet/asaa016
M3 - Article
C2 - 32831353
AN - SCOPUS:85091315643
SN - 0006-3444
VL - 107
SP - 573
EP - 589
JO - Biometrika
JF - Biometrika
IS - 3
ER -