TY - GEN
T1 - Non-compositional Expression Generation and its Continual Learning
AU - Zhou, Jianing
AU - Bhat, Suma
N1 - Publisher Copyright:
© 2024 Association for Computational Linguistics.
PY - 2024
Y1 - 2024
AB - Non-compositional expressions, such as idioms, are an integral part of natural language, and their figurative meanings cannot be derived directly from the meanings of their component words. Considering the scenario where these expressions form a long-tailed process in language, because of their occurrence in corpora and/or their gradual integration into use over time, this paper studies the ability of contemporary pre-trained language models to continually learn and generate them. Formulating this as a mask-infilling task termed CLoNE, the study probes the combined challenges of non-compositionality and continual learning. Using a set of three diverse idiomatic expression datasets repurposed for this task, we benchmark different large pre-trained language models and different continual learning methods on the task of non-compositional expression generation. Our experiments on the CLoNE task show that pre-trained language models are limited in their ability to generate non-compositional expressions, and that available continual learning methods are inadequate for the proposed CLoNE task, calling for more effective methods for continual learning of non-compositionality. Our datasets and code will be available at https://github.com/zhjjn/ContinualGeneration.git.
UR - http://www.scopus.com/inward/record.url?scp=85205320720&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85205320720&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85205320720
T3 - Proceedings of the Annual Meeting of the Association for Computational Linguistics
SP - 2828
EP - 2839
BT - 62nd Annual Meeting of the Association for Computational Linguistics, ACL 2024 - Proceedings of the Conference
A2 - Ku, Lun-Wei
A2 - Martins, Andre
A2 - Srikumar, Vivek
PB - Association for Computational Linguistics (ACL)
T2 - Findings of the 62nd Annual Meeting of the Association for Computational Linguistics, ACL 2024
Y2 - 11 August 2024 through 16 August 2024
ER -