TY - JOUR
T1 - Extraction of events and temporal expressions from clinical narratives
AU - Jindal, Prateek
AU - Roth, Dan
N1 - Funding Information:
The authors thank the anonymous reviewers for their valuable suggestions. This research was supported by Grant HHS 90TR0003/01 and by the Intelligence Advanced Research Projects Activity (IARPA) Foresight and Understanding from Scientific Exposition (FUSE) Program via Department of Interior National Business Center contract number D11PC2015. In addition, the i2b2 challenge was supported by Grants NIH NLM 2U54LM008748 (PI: Isaac Kohane) and NIH NLM 1R13LM011411-01 (PI: Ozlem Uzuner). The contents of this paper are solely the responsibility of the authors and do not necessarily represent the official views of the HHS, IARPA, DoI/NBC, NIH, NLM or the US government.
PY - 2013/12
Y1 - 2013/12
N2 - • Standard approaches for event extraction consider each event in isolation. • We design a sentence-level inference strategy for event extraction. • We use MeSH and SNOMED CT to design clinical descriptors. • We give a robust algorithm for date extraction. • Several rules were developed to extract and normalize complex temporal expressions. This paper addresses the task of event and temporal expression (timex) extraction from clinical narratives in the context of the i2b2 2012 challenge. State-of-the-art approaches for event extraction use a multi-class classifier to determine event types. However, such approaches consider each event in isolation. In this paper, we present a sentence-level inference strategy that enforces consistency constraints on the attributes of events that appear close to one another. Our approach is general and can be used for other tasks as well. We also design novel features, such as clinical descriptors derived from medical ontologies, which encode useful information about the concepts. For timex extraction, we adapt a state-of-the-art system, HeidelTime, for use in clinical narratives and develop several rules that complement HeidelTime. We also give a robust algorithm for date extraction. For the event extraction task, we achieved an overall F1 score of 0.71 for determining the span of the events along with their attributes. For the timex extraction task, we achieved an F1 score of 0.79 for determining the span of the temporal expressions. We present a detailed error analysis of our system and point out some factors that can help to improve its accuracy.
AB - • Standard approaches for event extraction consider each event in isolation. • We design a sentence-level inference strategy for event extraction. • We use MeSH and SNOMED CT to design clinical descriptors. • We give a robust algorithm for date extraction. • Several rules were developed to extract and normalize complex temporal expressions. This paper addresses the task of event and temporal expression (timex) extraction from clinical narratives in the context of the i2b2 2012 challenge. State-of-the-art approaches for event extraction use a multi-class classifier to determine event types. However, such approaches consider each event in isolation. In this paper, we present a sentence-level inference strategy that enforces consistency constraints on the attributes of events that appear close to one another. Our approach is general and can be used for other tasks as well. We also design novel features, such as clinical descriptors derived from medical ontologies, which encode useful information about the concepts. For timex extraction, we adapt a state-of-the-art system, HeidelTime, for use in clinical narratives and develop several rules that complement HeidelTime. We also give a robust algorithm for date extraction. For the event extraction task, we achieved an overall F1 score of 0.71 for determining the span of the events along with their attributes. For the timex extraction task, we achieved an F1 score of 0.79 for determining the span of the temporal expressions. We present a detailed error analysis of our system and point out some factors that can help to improve its accuracy.
KW - Electronic health records
KW - Information extraction
KW - Integer quadratic programming
KW - Named entity recognition
KW - Natural language processing
KW - Temporal extraction
UR - http://www.scopus.com/inward/record.url?scp=84897030608&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84897030608&partnerID=8YFLogxK
U2 - 10.1016/j.jbi.2013.08.010
DO - 10.1016/j.jbi.2013.08.010
M3 - Article
C2 - 24022023
AN - SCOPUS:84897030608
SN - 1532-0464
VL - 46
SP - S13-S19
JO - Journal of Biomedical Informatics
JF - Journal of Biomedical Informatics
IS - SUPPL.
ER -