TY - JOUR
T1 - Leveraging medical thesauri and physician feedback for improving medical literature retrieval for case queries
AU - Sondhi, Parikshit
AU - Sun, Jimeng
AU - Zhai, Cheng Xiang
AU - Sorrentino, Robert
AU - Kohn, Martin S.
PY - 2012/9
Y1 - 2012/9
N2 - Objective: This paper presents a study of methods for medical literature retrieval for case queries, in which the goal is to retrieve literature articles similar to a given patient case. In particular, it focuses on analyzing the performance of state-of-the-art general retrieval methods and improving them by the use of medical thesauri and physician feedback. Materials and Methods: The Kullback-Leibler divergence retrieval model with Dirichlet smoothing is used as the state-of-the-art general retrieval method. Pseudo-relevance feedback and term weighting methods are proposed by leveraging MeSH and UMLS thesauri. Evaluation is performed on a test collection recently created for the ImageCLEF medical case retrieval challenge. Results: Experimental results show that a well-tuned state-of-the-art general retrieval model achieves a mean average precision of 0.2754, but the performance can be improved by over 40% to 0.3980, through the proposed methods. Discussion: The results over the ImageCLEF test collection, which is currently the best collection available for the task, are encouraging. There are, however, limitations due to small evaluation set size. The analysis shows that further refinement of the methods is necessary before they can be really useful in a clinical setting. Conclusion: Medical case-based literature retrieval is a critical search application that presents a number of unique challenges. This analysis shows that the state-of-the-art general retrieval models are reasonably good for the task, but the performance can be significantly improved by developing new task-specific retrieval models that incorporate medical thesauri and physician feedback.
AB - Objective: This paper presents a study of methods for medical literature retrieval for case queries, in which the goal is to retrieve literature articles similar to a given patient case. In particular, it focuses on analyzing the performance of state-of-the-art general retrieval methods and improving them by the use of medical thesauri and physician feedback. Materials and Methods: The Kullback-Leibler divergence retrieval model with Dirichlet smoothing is used as the state-of-the-art general retrieval method. Pseudo-relevance feedback and term weighting methods are proposed by leveraging MeSH and UMLS thesauri. Evaluation is performed on a test collection recently created for the ImageCLEF medical case retrieval challenge. Results: Experimental results show that a well-tuned state-of-the-art general retrieval model achieves a mean average precision of 0.2754, but the performance can be improved by over 40% to 0.3980, through the proposed methods. Discussion: The results over the ImageCLEF test collection, which is currently the best collection available for the task, are encouraging. There are, however, limitations due to small evaluation set size. The analysis shows that further refinement of the methods is necessary before they can be really useful in a clinical setting. Conclusion: Medical case-based literature retrieval is a critical search application that presents a number of unique challenges. This analysis shows that the state-of-the-art general retrieval models are reasonably good for the task, but the performance can be significantly improved by developing new task-specific retrieval models that incorporate medical thesauri and physician feedback.
UR - http://www.scopus.com/inward/record.url?scp=84872241146&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84872241146&partnerID=8YFLogxK
U2 - 10.1136/amiajnl-2011-000293
DO - 10.1136/amiajnl-2011-000293
M3 - Article
C2 - 22437075
AN - SCOPUS:84872241146
SN - 1067-5027
VL - 19
SP - 851
EP - 858
JO - Journal of the American Medical Informatics Association
JF - Journal of the American Medical Informatics Association
IS - 5
ER -