TY - JOUR
T1 - Temporal phenotyping of medically complex children via PARAFAC2 tensor factorization
AU - Perros, Ioakeim
AU - Papalexakis, Evangelos E.
AU - Vuduc, Richard
AU - Searles, Elizabeth
AU - Sun, Jimeng
N1 - This work was supported by the National Science Foundation, awards IIS-#1418511 and CCF-#1533768, and the National Institutes of Health, United States, awards 1R01MD011682-01 and R56HL138415, Children's Healthcare of Atlanta, and UCB.
PY - 2019/5
Y1 - 2019/5
N2 - Objective: Our aim is to extract clinically meaningful phenotypes from longitudinal electronic health records (EHRs) of medically complex children. This is a fragile set of patients who consume a disproportionate amount of pediatric care resources but often end up with suboptimal clinical outcomes. The rise in available EHRs provides a rich data source that can be used to disentangle their complex clinical conditions into concise, clinically meaningful groups of characteristics. We aim to identify those phenotypes and their temporal evolution in a scalable, computational manner that avoids time-consuming manual chart review. Materials and methods: We analyze longitudinal EHRs from Children's Healthcare of Atlanta, including 1045 medically complex patients with a total of 59,948 encounters over 2 years. We apply a tensor factorization method called PARAFAC2 to extract: (a) clinically meaningful groups of features, (b) concise patient representations indicating the presence of a phenotype for each patient, and (c) temporal signatures indicating the evolution of those phenotypes over time for each patient. Results: We identified four medically complex phenotypes, namely gastrointestinal disorders, oncological conditions, blood-related disorders, and neurological system disorders, which have distinct clinical characterizations among patients. We demonstrate the utility of the patient representations produced by PARAFAC2 for identifying groups of patients with significant survival variations. Finally, we showcase representative examples of the temporal phenotypic trends extracted for different patients. Discussion: Unsupervised temporal phenotyping is an important task because it minimizes the burden on clinical experts by limiting their involvement to validating the output phenotypes.
PARAFAC2 has several compelling properties for temporal computational phenotyping: (a) it handles high-dimensional data and variable numbers of encounters across patients, (b) it has an intuitive interpretation, and (c) it is free from ad hoc parameter choices. Computational phenotypes, such as the ones computed by our approach, have multiple applications; we highlight three that are particularly useful for medically complex children: (1) integration into clinical decision support systems, (2) interpretable mortality prediction, and (3) clinical trial recruitment. Conclusion: PARAFAC2 can be applied to unsupervised temporal phenotyping tasks where precise definitions of the phenotypes are absent and the lengths of patient records vary.
KW - Computational phenotyping
KW - Temporal phenotyping
KW - Tensor analysis
UR - http://www.scopus.com/inward/record.url?scp=85063269910&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85063269910&partnerID=8YFLogxK
U2 - 10.1016/j.jbi.2019.103125
DO - 10.1016/j.jbi.2019.103125
M3 - Article
C2 - 30743070
AN - SCOPUS:85063269910
SN - 1532-0464
VL - 93
JO - Journal of Biomedical Informatics
JF - Journal of Biomedical Informatics
M1 - 103125
ER -