TY - GEN
T1 - Integrating distance metrics learned from multiple experts and its application in patient similarity assessment
AU - Wang, Fei
AU - Sun, Jimeng
AU - Ebadollahi, Shahram
PY - 2011/12/1
Y1 - 2011/12/1
N2 - Patient similarity assessment is an important task in the context of patient cohort identification for comparative effectiveness studies and clinical decision support applications. The goal is to derive a clinically meaningful distance metric that measures the similarity between patients represented by their key clinical indicators. It is desirable to learn the distance metric from experts' knowledge of clinical similarity among subjects. However, different physicians often have different understandings of patient similarity based on the specifics of the cases. The distance metric learned for each individual physician often provides only a limited view of the true underlying distance metric. The key challenge is how to integrate the individual distance metrics obtained for a group of physicians into a globally consistent unified metric. In this paper, we propose the Composite Distance Integration (Comdi) approach. In this approach, we first construct discriminative neighborhoods from each individual metric and then combine them into a single optimal distance metric. We formulate Comdi as a quadratic optimization problem and propose an efficient alternating strategy to find the optimal solution. Besides learning a globally consistent metric, Comdi provides an elegant way to share knowledge across multiple experts (physicians) without sharing the underlying data, which enables privacy-preserving collaboration. Our experiments on several benchmark data sets show approximately 10% improvement in classification accuracy over the baseline. These results show that Comdi is an effective and general metric learning approach. An application of our approach to real patient data is also presented in the results.
AB - Patient similarity assessment is an important task in the context of patient cohort identification for comparative effectiveness studies and clinical decision support applications. The goal is to derive a clinically meaningful distance metric that measures the similarity between patients represented by their key clinical indicators. It is desirable to learn the distance metric from experts' knowledge of clinical similarity among subjects. However, different physicians often have different understandings of patient similarity based on the specifics of the cases. The distance metric learned for each individual physician often provides only a limited view of the true underlying distance metric. The key challenge is how to integrate the individual distance metrics obtained for a group of physicians into a globally consistent unified metric. In this paper, we propose the Composite Distance Integration (Comdi) approach. In this approach, we first construct discriminative neighborhoods from each individual metric and then combine them into a single optimal distance metric. We formulate Comdi as a quadratic optimization problem and propose an efficient alternating strategy to find the optimal solution. Besides learning a globally consistent metric, Comdi provides an elegant way to share knowledge across multiple experts (physicians) without sharing the underlying data, which enables privacy-preserving collaboration. Our experiments on several benchmark data sets show approximately 10% improvement in classification accuracy over the baseline. These results show that Comdi is an effective and general metric learning approach. An application of our approach to real patient data is also presented in the results.
UR - http://www.scopus.com/inward/record.url?scp=84867231327&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84867231327&partnerID=8YFLogxK
U2 - 10.1137/1.9781611972818.6
DO - 10.1137/1.9781611972818.6
M3 - Conference contribution
AN - SCOPUS:84867231327
SN - 9780898719925
T3 - Proceedings of the 11th SIAM International Conference on Data Mining, SDM 2011
SP - 59
EP - 70
BT - Proceedings of the 11th SIAM International Conference on Data Mining, SDM 2011
PB - Society for Industrial and Applied Mathematics Publications
T2 - 11th SIAM International Conference on Data Mining, SDM 2011
Y2 - 28 April 2011 through 30 April 2011
ER -