Abstract
Coreference resolution is the problem of clustering mentions into entities and is very critical for natural language understanding. This paper studies the problem of coreference resolution in the context of the newly emerging domain of Electronic Health Records (EHRs). The commonly used "best-link" model for coreference resolution considers only the scores from a pairwise classifier in selecting the best antecedent. In this paper, we extend this model to include several constraints derived from surface-form of the mentions and the context in which they appear. Another major contribution of this paper is to show the use of domain-specific knowledge sources, mention parsing and clinical descriptors in deriving features which contribute to improved coreference resolution performance. We present experiments on 4 different clinical datasets illustrating that our approach outperforms a strong baseline and a state-of-the-art system by a wide margin.
Original language | English (US) |
---|---|
Pages | 1327-1342 |
Number of pages | 16 |
State | Published - 2012 |
Event | 24th International Conference on Computational Linguistics, COLING 2012 - Mumbai, India Duration: Dec 8 2012 → Dec 15 2012 |
Other
Other | 24th International Conference on Computational Linguistics, COLING 2012 |
---|---|
Country/Territory | India |
City | Mumbai |
Period | 12/8/12 → 12/15/12 |
Keywords
- Coreference resolution
- Electronic health records
- Information extraction
- Knowledge based systems
- Natural language processing
ASJC Scopus subject areas
- Computational Theory and Mathematics
- Language and Linguistics
- Linguistics and Language