Using knowledge and constraints to find the best antecedent

Prateek Jindal, Dan Roth

Research output: Contribution to conference › Paper › peer-review

Abstract

Coreference resolution is the problem of clustering mentions into entities and is very critical for natural language understanding. This paper studies the problem of coreference resolution in the context of the newly emerging domain of Electronic Health Records (EHRs). The commonly used "best-link" model for coreference resolution considers only the scores from a pairwise classifier in selecting the best antecedent. In this paper, we extend this model to include several constraints derived from surface-form of the mentions and the context in which they appear. Another major contribution of this paper is to show the use of domain-specific knowledge sources, mention parsing and clinical descriptors in deriving features which contribute to improved coreference resolution performance. We present experiments on 4 different clinical datasets illustrating that our approach outperforms a strong baseline and a state-of-the-art system by a wide margin.
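As a rough illustration of the idea described in the abstract, the sketch below shows a best-link decoder in which each mention picks its highest-scoring earlier mention as antecedent, with optional hard constraints that filter candidates before scoring. This is not the authors' implementation: the token-overlap scorer `toy_score`, the `laterality_ok` constraint, the threshold value, and the example mentions are all hypothetical stand-ins chosen only to make the example runnable.

```python
from typing import Callable, List, Optional, Sequence

Mention = str
Scorer = Callable[[Mention, Mention], float]
Constraint = Callable[[Mention, Mention], bool]


def best_link(
    mentions: Sequence[Mention],
    score: Scorer,
    constraints: Sequence[Constraint] = (),
    threshold: float = 0.4,  # assumed cutoff, not taken from the paper
) -> List[Optional[int]]:
    """For each mention, return the index of its best antecedent, or None."""
    antecedents: List[Optional[int]] = []
    for j, mention in enumerate(mentions):
        best_i, best_score = None, threshold
        for i in range(j):  # only earlier mentions can be antecedents
            candidate = mentions[i]
            # Hard constraints veto a candidate before the score is consulted.
            if not all(ok(candidate, mention) for ok in constraints):
                continue
            s = score(candidate, mention)
            if s > best_score:
                best_i, best_score = i, s
        antecedents.append(best_i)
    return antecedents


def toy_score(a: Mention, b: Mention) -> float:
    """Hypothetical pairwise scorer: Jaccard overlap of lowercase tokens."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb)


def laterality_ok(a: Mention, b: Mention) -> bool:
    """Hypothetical surface-form constraint: 'left' must not corefer with 'right'."""
    sides = {"left", "right"}
    sa, sb = sides & set(a.lower().split()), sides & set(b.lower().split())
    return not (sa and sb and sa.isdisjoint(sb))


mentions = ["left knee pain", "right knee pain", "knee pain on the left"]
unconstrained = best_link(mentions, toy_score)
constrained = best_link(mentions, toy_score, constraints=[laterality_ok])
print(unconstrained)  # [None, 0, 0]
print(constrained)    # [None, None, 0]
```

With scores alone, "right knee pain" is linked to "left knee pain" because the two strings overlap heavily; the constraint blocks that link while leaving the link between the two left-knee mentions intact.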

Original language: English (US)
Pages: 1327-1342
Number of pages: 16
State: Published - Dec 1 2012
Event: 24th International Conference on Computational Linguistics, COLING 2012 - Mumbai, India
Duration: Dec 8 2012 - Dec 15 2012

Other

Other: 24th International Conference on Computational Linguistics, COLING 2012
Country: India
City: Mumbai
Period: 12/8/12 - 12/15/12

Keywords

  • Coreference resolution
  • Electronic health records
  • Information extraction
  • Knowledge based systems
  • Natural language processing

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Language and Linguistics
  • Linguistics and Language
