TY - JOUR
T1 - Localizing differentially evolving covariance structures via scan statistics
AU - Mehta, Ronak
AU - Kim, Hyunwoo J.
AU - Wang, Shulei
AU - Johnson, Sterling C.
AU - Yuan, Ming
AU - Singh, Vikas
N1 - Funding Information:
This research was supported in part by NIH grants R01 AG040396, AG021155, EB022883 and NSF grants DMS 1265202 and CAREER award 1252725. The authors were also supported by the UW Center for Predictive Computational Phenotyping (via BD2K award AI117924) and the Wisconsin Alzheimer's Disease Research Center (AG033514).
Received March 2, 2018, and, in revised form, September 23, 2018. 2010 Mathematics Subject Classification. Primary 53B20, 53B21, 62F03, 62P10; Secondary 62H30, 62H10. The first author was supported by a fellowship via training grant award T32LM012413. The second author’s work on this project was performed at UW-Madison in 2017, before joining Amazon. The fifth author was supported by NSF grant DMS-1803450. Email address: ronakrm@cs.wisc.edu Email address: hwkim@cs.wisc.edu Email address: shulei@stat.wisc.edu Email address: scj@medicine.wisc.edu Email address: ming.yuan@columbia.edu Email address: vsingh@biostat.wisc.edu
Publisher Copyright:
© 2019 Brown University.
PY - 2019
Y1 - 2019
AB - Recent results in coupled or temporal graphical models offer schemes for estimating the relationship structure between features when the data come from related (but distinct) longitudinal sources. A novel application of these ideas is analyzing group-level differences, i.e., identifying whether trends of estimated objects (e.g., covariance or precision matrices) differ across disparate conditions (e.g., gender or disease). Often, poor effect sizes make detecting the differential signal over the full set of features difficult: for example, dependencies between only a subset of features may manifest differently across groups. In this work, we first give a parametric model for estimating trends in the space of SPD matrices as a function of one or more covariates. We then generalize scan statistics to graph structures, to search over distinct subsets of features (graph partitions) whose temporal dependency structure may show statistically significant groupwise differences. We theoretically analyze the family-wise error rate (FWER) and bounds on Type 1 and Type 2 error. Evaluating on U.S. census data, we identify groups of states with cultural and legal overlap related to baby name trends and drug usage. On a cohort of individuals with risk factors for Alzheimer's disease (but otherwise cognitively healthy), we find scientifically interesting group differences where the default analysis, i.e., models estimated on the full graph, does not survive reasonable significance thresholds.
UR - http://www.scopus.com/inward/record.url?scp=85063666529&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85063666529&partnerID=8YFLogxK
U2 - 10.1090/qam/1522
DO - 10.1090/qam/1522
M3 - Article
AN - SCOPUS:85063666529
SN - 0033-569X
VL - 77
SP - 357
EP - 398
JO - Quarterly of Applied Mathematics
JF - Quarterly of Applied Mathematics
IS - 2
ER -