Stability of Syntactic Dialect Classification Over Space and Time

Jonathan Dunn, Sidney Wong

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

This paper analyses the degree to which dialect classifiers based on syntactic representations remain stable over space and time. While previous work has shown that the combination of grammar induction and geospatial text classification produces robust dialect models, we do not know what influence changing grammars and changing populations have on dialect models. This paper constructs a test set for 12 dialects of English that spans three years at monthly intervals with a fixed spatial distribution across 1,120 cities. Syntactic representations are formulated within the usage-based Construction Grammar (CxG) paradigm. The decay rate of classification performance for each dialect over time allows us to identify regions undergoing syntactic change, and the distribution of classification accuracy within dialect regions allows us to identify the degree to which the grammar of a dialect is internally heterogeneous. The main contribution of this paper is to show that a rigorous evaluation of dialect classification models can be used to find both variation over space and change over time.

Original language: English (US)
Title of host publication: Proceedings of the 29th International Conference on Computational Linguistics
Publisher: International Committee on Computational Linguistics
Pages: 26-36
Number of pages: 11
State: Published - 2022
Externally published: Yes
Event: 29th International Conference on Computational Linguistics, COLING 2022 - Gyeongju, Korea, Republic of
Duration: Oct 12 2022 → Oct 17 2022

Publication series

Name: Proceedings - International Conference on Computational Linguistics, COLING
ISSN (Print): 2951-2093

Conference

Conference: 29th International Conference on Computational Linguistics, COLING 2022
Country/Territory: Korea, Republic of
City: Gyeongju
Period: 10/12/22 → 10/17/22

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Computer Science Applications
  • Theoretical Computer Science
