Language Contact and Population Contact as Sources of Dialect Similarity

Research output: Contribution to journal › Article › peer-review

Abstract

This paper creates a global similarity network between city-level dialects of English in order to determine whether external factors like the amount of population contact or language contact influence dialect similarity. While previous computational work has focused on external influences that contribute to phonological or lexical similarity, this paper focuses on grammatical variation as operationalized in computational construction grammar. Social media data was used to create comparable English corpora from 256 cities across 13 countries. Each sample is represented using the type frequency of various constructions. These frequency representations are then used to calculate pairwise similarities between city-level dialects; a prediction-based evaluation shows that these similarity values are highly accurate. Linguistic similarity is then compared with four external factors: (i) the amount of air travel between cities, a proxy for population contact, (ii) the difference in the linguistic landscapes of each city, a proxy for language contact, (iii) the geographic distance between cities, and (iv) the presence of political boundaries separating cities. The results show that, while all these factors are significant, the best model relies on language contact and geographic distance.
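The step from per-city frequency representations to pairwise similarities can be sketched as follows. This is a minimal illustration only: the abstract does not say which similarity measure the paper uses, so cosine similarity is an assumption here, and the city names, the function names, and the counts are all hypothetical toy values rather than data from the study.

```python
import math
from itertools import combinations


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length frequency vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an all-zero vector has no defined direction
    return dot / (norm_u * norm_v)


def pairwise_similarities(city_vectors):
    """Return a similarity score for every unordered pair of cities."""
    return {
        (a, b): cosine_similarity(city_vectors[a], city_vectors[b])
        for a, b in combinations(sorted(city_vectors), 2)
    }


# Hypothetical toy data: each position holds the type frequency of one
# construction. These numbers are invented for illustration.
city_vectors = {
    "city_a": [12, 0, 7, 3],
    "city_b": [10, 1, 8, 2],
    "city_c": [0, 9, 1, 14],
}

similarities = pairwise_similarities(city_vectors)
for pair, score in sorted(similarities.items()):
    print(pair, round(score, 3))
```

In a sketch like this, the resulting pairwise scores would form the edges of the similarity network, which could then be compared against external measures such as air travel volume or geographic distance.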
Original language: English (US)
Article number: 188
Journal: Languages
Volume: 10
Issue number: 8
State: Published - Jul 2025

Keywords

  • dialect similarity
  • construction grammar
  • language contact
  • population contact
  • computational sociolinguistics
