A Robust Risk Minimization based Named Entity Recognition System

Tong Zhang, David Johnson

Research output: Contribution to conference › Paper › peer-review

Abstract

This paper describes a robust linear classification system for Named Entity Recognition. A similar system has been applied to the CoNLL text chunking shared task with state-of-the-art performance. By using different linguistic features, we can easily adapt this system to other token-based linguistic tagging problems. The main focus of the current paper is to investigate the impact of various local linguistic features for named entity recognition on the CoNLL-2003 (Tjong Kim Sang and De Meulder, 2003) shared task data. We show that the system performance can be enhanced significantly with some relatively simple token-based features that are available for many languages. Although more sophisticated linguistic features will also be helpful, they provide much less improvement than might be expected.
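
To illustrate what "simple token-based features" can look like in practice, here is a minimal sketch. The abstract does not list the paper's actual features, so everything below is an assumption made for illustration: the function name `token_features`, the window size, and the choice of capitalization, digit, prefix, and suffix indicators are hypothetical and are not taken from the paper.

```python
def token_features(tokens, i, window=2):
    """Return a set of string-valued features for tokens[i].

    Hypothetical feature set for illustration only; not the paper's.
    """
    feats = set()
    tok = tokens[i]

    # Lowercased neighbouring tokens inside the window, padded at the edges.
    for off in range(-window, window + 1):
        j = i + off
        word = tokens[j].lower() if 0 <= j < len(tokens) else "<pad>"
        feats.add(f"w[{off}]={word}")

    # Surface-form indicators that need no language-specific resources.
    if tok[:1].isupper():
        feats.add("init_cap")
    if tok.isupper():
        feats.add("all_caps")
    if any(c.isdigit() for c in tok):
        feats.add("has_digit")

    # Short prefixes and suffixes of the current token.
    for n in range(1, 4):
        if len(tok) > n:
            feats.add(f"pre{n}={tok[:n].lower()}")
            feats.add(f"suf{n}={tok[-n:].lower()}")

    return feats


tokens = ["Tong", "Zhang", "visited", "Edmonton"]
features = token_features(tokens, 3)
```

Each feature string would typically be mapped to one dimension of a sparse binary vector and fed to a linear classifier that predicts a tag per token; that pipeline is likewise an assumption here, not a description of the authors' system.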

Original language: English (US)
Pages: 204-207
Number of pages: 4
State: Published - 2003
Externally published: Yes
Event: 7th Conference on Natural Language Learning, CoNLL 2003 at HLT-NAACL 2003 - Edmonton, Canada
Duration: May 31 2003 to Jun 1 2003

Conference

Conference: 7th Conference on Natural Language Learning, CoNLL 2003 at HLT-NAACL 2003
Country/Territory: Canada
City: Edmonton
Period: 5/31/03 to 6/1/03

ASJC Scopus subject areas

  • Management Science and Operations Research
  • Computer Graphics and Computer-Aided Design
  • Computer Vision and Pattern Recognition
  • Modeling and Simulation
