Abstract
As a key natural language processing (NLP) task, word sense disambiguation (WSD) evaluates how well NLP models can understand the lexical semantics of words in specific contexts. Benefiting from large-scale annotation, current WSD systems have achieved impressive performance in English by combining supervised learning with lexical knowledge. However, such success is hard to replicate in other languages, where only limited annotations are available. In this paper, building on the multilingual lexicon BabelNet, which describes the same set of concepts across languages, we propose knowledge- and supervision-based Multilingual Word Sense Disambiguation (MWSD) systems. We build unified sense representations for multiple languages and address the annotation scarcity problem in MWSD by transferring annotations from resource-rich languages to resource-poor ones. With the unified sense representations, annotations from multiple languages can be used for joint training, benefiting the MWSD task. Evaluations on the SemEval-13 and SemEval-15 datasets demonstrate the effectiveness of our methodology.
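The core idea of the annotation-transfer step can be illustrated with a minimal sketch, assuming that sense-annotated examples in every language are mapped onto a shared BabelNet synset inventory. The snippet below is hypothetical (it is not the authors' code, and the synset IDs and example sentences are invented for illustration); it only shows how annotations keyed by the same synset ID can be pooled across languages into one joint training set.

```python
# Hypothetical sketch: pool sense annotations from several languages into a
# single training set keyed by shared BabelNet synset IDs, so one MWSD model
# can be trained jointly on all of them.
from collections import defaultdict
from typing import Dict, List, Tuple

# Toy annotated examples: (sentence, target word, BabelNet synset ID).
# Synset IDs below are placeholders, not real BabelNet entries.
annotations: Dict[str, List[Tuple[str, str, str]]] = {
    "en": [("He sat on the bank of the river", "bank", "bn:00008364n")],
    "it": [("Si sedette sulla riva del fiume", "riva", "bn:00008364n")],
    "es": [("Abrió una cuenta en el banco", "banco", "bn:00008363n")],
}

def pool_annotations(per_language: Dict[str, List[Tuple[str, str, str]]]):
    """Merge annotations from all languages into one synset-keyed pool."""
    pooled: Dict[str, List[Tuple[str, str, str]]] = defaultdict(list)
    for lang, examples in per_language.items():
        for sentence, target, synset_id in examples:
            # Because every language shares the BabelNet synset inventory,
            # annotations from a resource-rich language (e.g. English) provide
            # training signal for senses that are rarely annotated elsewhere.
            pooled[synset_id].append((lang, sentence, target))
    return pooled

if __name__ == "__main__":
    for synset_id, examples in pool_annotations(annotations).items():
        print(synset_id, examples)
```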
| Original language | English (US) |
| --- | --- |
| Pages (from-to) | 4193-4202 |
| Number of pages | 10 |
| Journal | Proceedings - International Conference on Computational Linguistics, COLING |
| Volume | 29 |
| Issue number | 1 |
| State | Published - 2022 |
| Externally published | Yes |
| Event | 29th International Conference on Computational Linguistics, COLING 2022, Gyeongju, Korea, Republic of. Duration: Oct 12 2022 → Oct 17 2022 |
ASJC Scopus subject areas
- Computational Theory and Mathematics
- Computer Science Applications
- Theoretical Computer Science