TY - GEN
T1 - Coreference Resolution for Structured Drug Product Labels
AU - Kilicoglu, Halil
AU - Demner-Fushman, Dina
N1 - Publisher Copyright: ©2014 Association for Computational Linguistics
PY - 2014
Y1 - 2014
N2 - FDA drug package inserts provide comprehensive and authoritative information about drugs. DailyMed database is a repository of structured product labels extracted from these package inserts. Most salient information about drugs remains in free text portions of these labels. Extracting information from these portions can improve the safety and quality of drug prescription. In this paper, we present a study that focuses on resolution of coreferential information from drug labels contained in DailyMed. We generalized and expanded an existing rule-based coreference resolution module for this purpose. Enhancements include resolution of set/instance anaphora, recognition of appositive constructions and wider use of UMLS semantic knowledge. We obtained an improvement of 40% over the baseline with unweighted average F1-measure using B-CUBED, MUC, and CEAF metrics. The results underscore the importance of set/instance anaphora and appositive constructions in this type of text and point out the shortcomings in coreference annotation in the dataset.
UR - http://www.scopus.com/inward/record.url?scp=85093067804&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85093067804&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85093067804
T3 - Proceedings of the Annual Meeting of the Association for Computational Linguistics
SP - 45
EP - 53
BT - ACL 2014 - BioNLP 2014, Workshop on Biomedical Natural Language Processing, Proceedings of the Workshop
PB - Association for Computational Linguistics (ACL)
T2 - ACL 2014 Workshop on Biomedical Natural Language Processing, BioNLP 2014
Y2 - 27 June 2014 through 28 June 2014
ER -