TY - JOUR
T1 - Using domain knowledge and domain-inspired discourse model for coreference resolution for clinical narratives
AU - Jindal, Prateek
AU - Roth, Dan
PY - 2013
Y1 - 2013
N2 - Objective: This paper presents a coreference resolution system for clinical narratives. Coreference resolution aims at clustering all mentions in a single document to coherent entities. Materials and methods: A knowledge-intensive approach for coreference resolution is employed. The domain knowledge used includes several domain-specific lists, a knowledge intensive mention parsing, and task informed discourse model. Mention parsing allows us to abstract over the surface form of the mention and represent each mention using a higher-level representation, which we call the mention's semantic representation (SR). SR reduces the mention to a standard form and hence provides better support for comparing and matching. Existing coreference resolution systems tend to ignore discourse aspects and rely heavily on lexical and structural cues in the text. The authors break from this tradition and present a discourse model for "person" type mentions in clinical narratives, which greatly simplifies the coreference resolution. Results: This system was evaluated on four different datasets which were made available in the 2011 i2b2/VA coreference challenge. The unweighted average of F1 scores (over B-cubed, MUC and CEAF) varied from 84.2% to 88.1%. These experiments show that domain knowledge is effective for different mention types for all the datasets. Discussion: Error analysis shows that most of the recall errors made by the system can be handled by further addition of domain knowledge. The precision errors, on the other hand, are more subtle and indicate the need to understand the relations in which mentions participate for building a robust coreference system. Conclusion: This paper presents an approach that makes an extensive use of domain knowledge to significantly improve coreference resolution. The authors state that their system and the knowledge sources developed will be made publicly available.
AB - Objective: This paper presents a coreference resolution system for clinical narratives. Coreference resolution aims at clustering all mentions in a single document to coherent entities. Materials and methods: A knowledge-intensive approach for coreference resolution is employed. The domain knowledge used includes several domain-specific lists, a knowledge intensive mention parsing, and task informed discourse model. Mention parsing allows us to abstract over the surface form of the mention and represent each mention using a higher-level representation, which we call the mention's semantic representation (SR). SR reduces the mention to a standard form and hence provides better support for comparing and matching. Existing coreference resolution systems tend to ignore discourse aspects and rely heavily on lexical and structural cues in the text. The authors break from this tradition and present a discourse model for "person" type mentions in clinical narratives, which greatly simplifies the coreference resolution. Results: This system was evaluated on four different datasets which were made available in the 2011 i2b2/VA coreference challenge. The unweighted average of F1 scores (over B-cubed, MUC and CEAF) varied from 84.2% to 88.1%. These experiments show that domain knowledge is effective for different mention types for all the datasets. Discussion: Error analysis shows that most of the recall errors made by the system can be handled by further addition of domain knowledge. The precision errors, on the other hand, are more subtle and indicate the need to understand the relations in which mentions participate for building a robust coreference system. Conclusion: This paper presents an approach that makes an extensive use of domain knowledge to significantly improve coreference resolution. The authors state that their system and the knowledge sources developed will be made publicly available.
UR - http://www.scopus.com/inward/record.url?scp=84874765526&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84874765526&partnerID=8YFLogxK
U2 - 10.1136/amiajnl-2011-000767
DO - 10.1136/amiajnl-2011-000767
M3 - Article
C2 - 22781192
AN - SCOPUS:84874765526
SN - 1067-5027
VL - 20
SP - 356
EP - 362
JO - Journal of the American Medical Informatics Association
JF - Journal of the American Medical Informatics Association
IS - 2
ER -