TY - GEN
T1 - Error analysis of Uyghur name tagging
T2 - 11th International Conference on Language Resources and Evaluation, LREC 2018
AU - Halidanmu, Abudukelimu
AU - Abudoukelimu, Abulizi
AU - Zhang, Boliang
AU - Pan, Xiaoman
AU - Lu, Di
AU - Ji, Heng
AU - Liu, Yang
N1 - Publisher Copyright: © LREC 2018 - 11th International Conference on Language Resources and Evaluation. All rights reserved.
PY - 2019
Y1 - 2019
N2 - Despite numerous efforts at name tagging for Uyghur, there is limited understanding of the performance ceiling. In this paper, we take a close look at the successful cases, carefully analyze the remaining errors of a state-of-the-art Uyghur name tagger, systematically categorize the challenges, and propose possible solutions. We conclude that simply adopting a machine learning model that has proven successful for high-resource languages, along with language-independent superficial features, is unlikely to be effective for Uyghur or for low-resource languages in general. Further advancement requires exploiting rich language-specific knowledge and non-traditional linguistic resources, as well as novel methods to encode them into machine learning frameworks.
KW - Error analysis
KW - Low-resource languages
KW - Name tagging
UR - http://www.scopus.com/inward/record.url?scp=85059879006&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85059879006&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85059879006
T3 - LREC 2018 - 11th International Conference on Language Resources and Evaluation
SP - 4421
EP - 4426
BT - LREC 2018 - 11th International Conference on Language Resources and Evaluation
A2 - Isahara, Hitoshi
A2 - Maegaard, Bente
A2 - Piperidis, Stelios
A2 - Cieri, Christopher
A2 - Declerck, Thierry
A2 - Hasida, Koiti
A2 - Mazo, Helene
A2 - Choukri, Khalid
A2 - Goggi, Sara
A2 - Mariani, Joseph
A2 - Moreno, Asuncion
A2 - Calzolari, Nicoletta
A2 - Odijk, Jan
A2 - Tokunaga, Takenobu
PB - European Language Resources Association (ELRA)
Y2 - 7 May 2018 through 12 May 2018
ER -