Learned Construction Grammars Converge Across Registers Given Increased Exposure

Jonathan Dunn, Harish Tayyar Madabushi

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

This paper measures the impact of increased exposure on whether learned construction grammars converge onto shared representations when trained on data from different registers. Register influences the frequency of constructions, with some structures common in formal but not informal usage. We expect that a grammar induction algorithm exposed to different registers will acquire different constructions. To what degree does increased exposure lead to the convergence of register-specific grammars? The experiments in this paper simulate language learning in 12 languages (half Germanic and half Romance) with corpora representing three registers (Twitter, Wikipedia, Web). These simulations are repeated with increasing amounts of exposure, from 100k to 2 million words, to measure the impact of exposure on the convergence of grammars. The results show that increased exposure does lead to converging grammars across all languages. In addition, a shared core of register-universal constructions remains constant across increasing amounts of exposure.

Original language: English (US)
Title of host publication: CoNLL 2021 - 25th Conference on Computational Natural Language Learning, Proceedings
Editors: Arianna Bisazza, Omri Abend
Publisher: Association for Computational Linguistics (ACL)
Pages: 268-278
Number of pages: 11
ISBN (Electronic): 9781955917056
State: Published - 2021
Externally published: Yes
Event: 25th Conference on Computational Natural Language Learning, CoNLL 2021 - Virtual, Online
Duration: Nov 10 2021 → Nov 11 2021

Conference

Conference: 25th Conference on Computational Natural Language Learning, CoNLL 2021
City: Virtual, Online
Period: 11/10/21 → 11/11/21

ASJC Scopus subject areas

  • Artificial Intelligence
  • Human-Computer Interaction
  • Linguistics and Language
