TY - GEN
T1 - An Iterative Self-Learning Framework for Medical Domain Generalization
AU - Wu, Zhenbang
AU - Yao, Huaxiu
AU - Liebovitz, David M.
AU - Sun, Jimeng
N1 - This work was supported by NSF award SCH-2205289, SCH-2014438, and IIS-2034479. This project has been funded by the Jump ARCHES endowment through the Health Care Engineering Systems Center.
PY - 2023
Y1 - 2023
N2 - Deep learning models have been widely used to assist doctors with clinical decision-making. However, these models often encounter a significant performance drop when applied to data that differs from the distribution they were trained on. This challenge is known as the domain shift problem. Existing domain generalization algorithms attempt to address this problem by assuming the availability of domain IDs and training a single model to handle all domains. However, in healthcare settings, patients can be classified into numerous latent domains, where the actual domain categorizations are unknown. Furthermore, each patient domain exhibits distinct clinical characteristics, making it sub-optimal to train a single model for all domains. To overcome these limitations, we propose SLDG, a self-learning framework that iteratively discovers decoupled domains and trains personalized classifiers for each decoupled domain. We evaluate the generalizability of SLDG across spatial and temporal data distribution shifts on two real-world public EHR datasets: eICU and MIMIC-IV. Our results show that SLDG achieves up to 11% improvement in the AUPRC score over the best baseline.
AB - Deep learning models have been widely used to assist doctors with clinical decision-making. However, these models often encounter a significant performance drop when applied to data that differs from the distribution they were trained on. This challenge is known as the domain shift problem. Existing domain generalization algorithms attempt to address this problem by assuming the availability of domain IDs and training a single model to handle all domains. However, in healthcare settings, patients can be classified into numerous latent domains, where the actual domain categorizations are unknown. Furthermore, each patient domain exhibits distinct clinical characteristics, making it sub-optimal to train a single model for all domains. To overcome these limitations, we propose SLDG, a self-learning framework that iteratively discovers decoupled domains and trains personalized classifiers for each decoupled domain. We evaluate the generalizability of SLDG across spatial and temporal data distribution shifts on two real-world public EHR datasets: eICU and MIMIC-IV. Our results show that SLDG achieves up to 11% improvement in the AUPRC score over the best baseline.
UR - http://www.scopus.com/inward/record.url?scp=85186366196&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85186366196&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85186366196
T3 - Advances in Neural Information Processing Systems
BT - Advances in Neural Information Processing Systems 36 - 37th Conference on Neural Information Processing Systems, NeurIPS 2023
A2 - Oh, A.
A2 - Neumann, T.
A2 - Globerson, A.
A2 - Saenko, K.
A2 - Hardt, M.
A2 - Levine, S.
PB - Neural information processing systems foundation
T2 - 37th Conference on Neural Information Processing Systems, NeurIPS 2023
Y2 - 10 December 2023 through 16 December 2023
ER -