TY - JOUR
T1 - Martingale Difference Correlation and Its Use in High-Dimensional Variable Screening
AU - Shao, Xiaofeng
AU - Zhang, Jingsi
N1 - Xiaofeng Shao is Associate Professor, Department of Statistics, University of Illinois at Urbana-Champaign, Champaign, IL (E-mail: [email protected]). Jingsi Zhang is Ph.D. student in the Department of Statistics, Northwestern University, Evanston, IL (E-mail: [email protected]). Both authors equally contribute to this article, and the authors are listed in the alphabetical order. Shao’s research was partially supported by NSF grants DMS08-04937 and DMS11-04545. The authors thank all the contributors of He, Wang and Hong (2013) for providing the R codes used in their article and for a clarification of their example 3.a in numerical simulations. Thanks also go to Professor Runze Li and Dr. Lukas Meier for providing the R codes used in the papers by Li, Zhong, and Zhu (2012), and Meier, van de Geer, and Bühlmann (2009), respectively. The authors are also grateful to the associate editor and three referees for their constructive comments that led to a substantial improvement of the article.
PY - 2014/9
Y1 - 2014/9
N2 - In this article, we propose a new metric, the so-called martingale difference correlation, to measure the departure of conditional mean independence between a scalar response variable V and a vector predictor variable U. Our metric is a natural extension of distance correlation proposed by Székely, Rizzo, and Bakirov, which is used to measure the dependence between V and U. The martingale difference correlation and its empirical counterpart inherit a number of desirable features of distance correlation and sample distance correlation, such as algebraic simplicity and elegant theoretical properties. We further use martingale difference correlation as a marginal utility to do high-dimensional variable screening to screen out variables that do not contribute to conditional mean of the response given the covariates. Further extension to conditional quantile screening is also described in detail and sure screening properties are rigorously justified. Both simulation results and real data illustrations demonstrate the effectiveness of martingale difference correlation-based screening procedures in comparison with the existing counterparts. Supplementary materials for this article are available online.
AB - In this article, we propose a new metric, the so-called martingale difference correlation, to measure the departure of conditional mean independence between a scalar response variable V and a vector predictor variable U. Our metric is a natural extension of distance correlation proposed by Székely, Rizzo, and Bakirov, which is used to measure the dependence between V and U. The martingale difference correlation and its empirical counterpart inherit a number of desirable features of distance correlation and sample distance correlation, such as algebraic simplicity and elegant theoretical properties. We further use martingale difference correlation as a marginal utility to do high-dimensional variable screening to screen out variables that do not contribute to conditional mean of the response given the covariates. Further extension to conditional quantile screening is also described in detail and sure screening properties are rigorously justified. Both simulation results and real data illustrations demonstrate the effectiveness of martingale difference correlation-based screening procedures in comparison with the existing counterparts. Supplementary materials for this article are available online.
KW - Conditional mean
KW - Feature screening
KW - High-dimensional inference
KW - Sure screening property
UR - http://www.scopus.com/inward/record.url?scp=84907486707&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84907486707&partnerID=8YFLogxK
U2 - 10.1080/01621459.2014.887012
DO - 10.1080/01621459.2014.887012
M3 - Article
AN - SCOPUS:84907486707
SN - 0162-1459
VL - 109
SP - 1302
EP - 1318
JO - Journal of the American Statistical Association
JF - Journal of the American Statistical Association
IS - 507
ER -