TY - JOUR
T1 - Privacy-preserving patient similarity learning in a federated environment
T2 - Development and analysis
AU - Lee, Junghye
AU - Sun, Jimeng
AU - Wang, Fei
AU - Wang, Shuang
AU - Jun, Chi-Hyuck
AU - Jiang, Xiaoqian
N1 - Publisher Copyright: © Junghye Lee, Jimeng Sun, Fei Wang, Shuang Wang, Chi-Hyuck Jun, Xiaoqian Jiang.
PY - 2018/4
Y1 - 2018/4
AB - Background: There is an urgent need for global analytic frameworks that can perform analyses across multiple institutions in a federated environment without privacy leakage. A few recent studies on federated medical analysis have focused on several algorithms. However, none of them has addressed similar-patient matching, which is useful for applications such as cohort construction for cross-institution observational studies, disease surveillance, and clinical trial recruitment. Objective: The aim of this study was to present a privacy-preserving platform in a federated setting for patient similarity learning across institutions. Without sharing patient-level information, our model can find similar patients from one hospital to another. Methods: We proposed a federated patient hashing framework and developed a novel algorithm to learn context-specific hash codes to represent patients across institutions. The similarities between patients can be efficiently computed using the resulting hash codes of the corresponding patients. To guard against reverse-engineering attacks on the model, we applied homomorphic encryption to patient similarity search in a federated setting. Results: We used sequential medical events extracted from the Multiparameter Intelligent Monitoring in Intensive Care-III database to evaluate the proposed algorithm in predicting the incidence of five diseases independently. Our algorithm achieved average areas under the curve of 0.9154 and 0.8012 with balanced and imbalanced data, respectively, in k-nearest neighbor with k=3. We also confirmed privacy preservation in similarity search by using homomorphic encryption. Conclusions: The proposed algorithm can help search for similar patients across institutions effectively to support federated data analysis in a privacy-preserving manner.
KW - Federated environment
KW - Hashing
KW - Homomorphic encryption
KW - Privacy
KW - Similarity learning
UR - http://www.scopus.com/inward/record.url?scp=85047723755&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85047723755&partnerID=8YFLogxK
U2 - 10.2196/medinform.7744
DO - 10.2196/medinform.7744
M3 - Article
C2 - 29653917
AN - SCOPUS:85047723755
SN - 1438-8871
VL - 20
JO - Journal of Medical Internet Research
JF - Journal of Medical Internet Research
IS - 4
M1 - e20
ER -