TY - GEN
T1 - Improved finite-state morphological analysis for St. Lawrence Island Yupik using Paradigm Function Morphology
AU - Chen, Emily
AU - Park, Hyunji Hayley
AU - Schwartz, Lane
N1 - Publisher Copyright:
© European Language Resources Association (ELRA), licensed under CC-BY-NC
PY - 2020
Y1 - 2020
AB - St. Lawrence Island Yupik is an endangered polysynthetic language of the Bering Strait region. While conducting linguistic fieldwork between 2016 and 2019, we observed substantial support within the Yupik community for language revitalization and for resource development to support Yupik education. To that end, Chen and Schwartz (2018) implemented a finite-state morphological analyzer as a critical enabling technology for use in Yupik language education and technology. Chen and Schwartz (2018) reported a morphological analysis coverage rate of approximately 75% on a dataset of 60K Yupik tokens, leaving considerable room for improvement. In this work, we present a re-implementation of the Chen and Schwartz (2018) finite-state morphological analyzer for St. Lawrence Island Yupik that incorporates new linguistic insights; in particular, in this implementation we make use of the Paradigm Function Morphology (PFM) theory of morphology. We evaluate this new PFM-based morphological analyzer, and demonstrate that it consistently outperforms the existing analyzer of Chen and Schwartz (2018) with respect to accuracy and coverage rate across multiple datasets.
KW - Computational morphology
KW - Language revitalization
KW - Linguistic resource
KW - Morphological analysis
KW - Yupik languages
UR - http://www.scopus.com/inward/record.url?scp=85096567112&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85096567112&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85096567112
T3 - LREC 2020 - 12th International Conference on Language Resources and Evaluation, Conference Proceedings
SP - 2676
EP - 2684
BT - LREC 2020 - 12th International Conference on Language Resources and Evaluation, Conference Proceedings
A2 - Calzolari, Nicoletta
A2 - Bechet, Frederic
A2 - Blache, Philippe
A2 - Choukri, Khalid
A2 - Cieri, Christopher
A2 - Declerck, Thierry
A2 - Goggi, Sara
A2 - Isahara, Hitoshi
A2 - Maegaard, Bente
A2 - Mariani, Joseph
A2 - Mazo, Helene
A2 - Moreno, Asuncion
A2 - Odijk, Jan
A2 - Piperidis, Stelios
PB - European Language Resources Association (ELRA)
T2 - 12th International Conference on Language Resources and Evaluation, LREC 2020
Y2 - 11 May 2020 through 16 May 2020
ER -