Arabic named entity recognition: What works and what's next

Liyuan Liu, Jingbo Shang, Jiawei Han

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

This paper presents the winning solution to the Arabic Named Entity Recognition challenge run by Topcoder.com. The proposed model integrates several tailored techniques, including representation learning, feature engineering, sequence labeling, and ensemble learning. The final model achieves a test F1 score of 75.82% on the AQMAR dataset and outperforms baselines by a large margin. Detailed analyses are conducted to reveal both its strengths and limitations. Specifically, we observe that (1) representation learning modules can significantly boost performance but require proper pre-processing and (2) the resulting embeddings can be further enhanced with feature engineering because of the limited size of the training data. All implementations and pre-trained models are made public.

Original language: English (US)
Title of host publication: ACL 2019 - 4th Arabic Natural Language Processing Workshop, WANLP 2019 - Proceedings of the Workshop
Publisher: Association for Computational Linguistics (ACL)
Pages: 60-67
Number of pages: 8
ISBN (Electronic): 9781950737321
State: Published - 2019
Event: 4th Arabic Natural Language Processing Workshop, WANLP 2019, held at ACL 2019 - Florence, Italy
Duration: Aug 1 2019 → …

Publication series

Name: ACL 2019 - 4th Arabic Natural Language Processing Workshop, WANLP 2019 - Proceedings of the Workshop

Conference

Conference: 4th Arabic Natural Language Processing Workshop, WANLP 2019, held at ACL 2019
Country/Territory: Italy
City: Florence
Period: 8/1/19 → …

ASJC Scopus subject areas

  • Software
  • Language and Linguistics
  • Linguistics and Language
