Zero-shot cross-lingual name retrieval for low-resource languages

Kevin Blissett, Heng Ji

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

In this paper we address a challenging cross-lingual name retrieval task. Given an English named entity query, we aim to find all name mentions in documents in low-resource languages. We present a novel method which relies on zero annotation or resources from the target language. By leveraging freely available, cross-lingual resources and a small amount of training data from another language, we are able to perform name retrieval on a new language without any additional training data. Our method proceeds in a multi-step process: first, we pretrain a language-independent orthographic encoder using Wikipedia inter-lingual links from dozens of languages. Next, we gather user expectations about important entities in an English comparable document and compare those expected entities with actual spans of the target language text in order to perform name finding. Our method shows 11.6% absolute F-score improvement over state-of-the-art methods.

Original language: English (US)
Title of host publication: DeepLo@EMNLP-IJCNLP 2019 - Proceedings of the 2nd Workshop on Deep Learning Approaches for Low-Resource Natural Language Processing - Proceedings
Publisher: Association for Computational Linguistics (ACL)
Pages: 275-280
Number of pages: 6
ISBN (Electronic): 9781950737789
State: Published - 2021
Event: 2nd Workshop on Deep Learning Approaches for Low-Resource Natural Language Processing, DeepLo@EMNLP-IJCNLP 2019 - Hong Kong, China
Duration: Nov 3 2019 → …

Publication series

Name: DeepLo@EMNLP-IJCNLP 2019 - Proceedings of the 2nd Workshop on Deep Learning Approaches for Low-Resource Natural Language Processing - Proceedings

Conference

Conference: 2nd Workshop on Deep Learning Approaches for Low-Resource Natural Language Processing, DeepLo@EMNLP-IJCNLP 2019
Country/Territory: China
City: Hong Kong
Period: 11/3/19 → …

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Software
