Transliteration as constrained optimization

Dan Goldwasser, Dan Roth

Research output: Contribution to conference › Paper › peer-review

Abstract

This paper introduces a new method for identifying named-entity (NE) transliterations in bilingual corpora. Recent work has shown the advantage of discriminative approaches to transliteration: given two strings (w_s, w_t) in the source and target language, a classifier is trained to determine if w_t is the transliteration of w_s. This paper shows that the transliteration problem can be formulated as a constrained optimization problem and can thus take into account contextual dependencies and constraints among character bi-grams in the two strings. We further explore several methods for learning the objective function of the optimization problem and show the advantage of learning it discriminatively. Our experiments show that the new framework results in over 50% improvement in translating English NEs to Hebrew.
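
As a rough illustration of what "constrained optimization over character bi-grams" can mean, the following toy sketch selects a set of (source bi-gram, target bi-gram) pairs that maximizes a total weight, subject to two constraints: each bi-gram is used at most once, and selected pairs may not cross. It is not the formulation or solver used in the paper; the function names, the choice of constraints, and the weights are hypothetical, and in practice the weights would be learned rather than written by hand.

```python
from functools import lru_cache


def bigrams(word):
    """Return the character bigrams of a word, in order."""
    return [word[i:i + 2] for i in range(len(word) - 1)]


def best_alignment_score(source, target, weights):
    """Toy constrained optimization (an assumption, not the paper's method).

    Maximizes the summed weight of selected (source bigram, target bigram)
    pairs, subject to: each bigram is used at most once, and selected
    pairs must not cross (the mapping is monotone).
    """
    s, t = bigrams(source), bigrams(target)

    @lru_cache(maxsize=None)
    def solve(i, j):
        # No bigrams left on one side: nothing more can be paired.
        if i == len(s) or j == len(t):
            return 0.0
        skip_source = solve(i + 1, j)   # leave s[i] unpaired
        skip_target = solve(i, j + 1)   # leave t[j] unpaired
        # Pair s[i] with t[j]; unknown pairs contribute weight 0.
        pair = weights.get((s[i], t[j]), 0.0) + solve(i + 1, j + 1)
        return max(skip_source, skip_target, pair)

    return solve(0, 0)


# Hypothetical hand-written weights over placeholder strings.
toy_weights = {
    ("ab", "xy"): 1.0,
    ("bc", "yz"): 1.5,
    ("ab", "yz"): 2.0,  # pairing this with ("bc", "xy") would cross
    ("bc", "xy"): 2.0,
}
score = best_alignment_score("abc", "xyz", toy_weights)
```

In this example the two highest-weight pairs cannot be chosen together because they cross, so the constraint changes which mapping is optimal. A classifier in this style would accept (w_s, w_t) as a transliteration pair when the optimal score clears some threshold.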

Original language: English (US)
Pages: 353-362
Number of pages: 10
State: Published - 2008
Externally published: Yes
Event: 2008 Conference on Empirical Methods in Natural Language Processing, EMNLP 2008, Co-located with AMTA 2008 and the International Workshop on Spoken Language Translation - Honolulu, HI, United States
Duration: Oct 25 2008 to Oct 27 2008

Country/Territory: United States
City: Honolulu, HI
Period: 10/25/08 to 10/27/08

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Computer Science Applications
  • Information Systems
