Understanding how languages change is important not only for reconstructing protolanguages and estimating diversification dates (i.e. the dates when languages split), but also for inferring evolutionary trees (or phylogenetic networks) of language families. We propose a parametric model of language change that addresses lexical polymorphism (two or more words for a given basic meaning) based on what is known about how languages change. Under our model, changes of state in lexical characters occur only through semantic shift or borrowing, leading to (potentially brief) periods in which polymorphism is present. Across a wide range of model conditions, we find that a simple and natural modification of the maximum parsimony (MP) criterion (which seeks the tree with the fewest changes) that allows it to handle polymorphic characters has the best accuracy, substantially improving on well-known Bayesian methods based on appearances and disappearances of words. We also provide a new analysis of Indo-European that takes polymorphism into account, finding support for a previous tree (Nakhleh et al., 2006) and a new tree that differs from the previous tree in the relationship between Italo-Celtic and Tocharian.
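
As a rough illustration of what scoring a tree under parsimony with polymorphic characters can look like, the sketch below counts changes for one lexical character on a fixed rooted binary tree using the classic Fitch algorithm, with each language carrying a set of attested words rather than a single word. This is not the modified MP criterion described in the abstract, whose details are not given here. Treating polymorphism as a plain state set is an assumption made for illustration, and the function name `fitch_score`, the language labels, and the word labels are all hypothetical.

```python
def fitch_score(tree, leaf_states):
    """Count Fitch parsimony changes for one character on a rooted binary tree.

    tree: nested 2-tuples with leaf names (str) at the tips.
    leaf_states: dict mapping leaf name -> set of states (words for one meaning).
    Assumption: a polymorphic language is represented simply as a set with
    more than one state; this is an illustrative simplification.
    """
    def visit(node):
        # Leaf: return its attested state set and zero changes so far.
        if isinstance(node, str):
            return set(leaf_states[node]), 0
        left, right = node
        left_set, left_cost = visit(left)
        right_set, right_cost = visit(right)
        common = left_set & right_set
        if common:
            # Children share a state: no extra change is needed at this node.
            return common, left_cost + right_cost
        # No shared state: take the union and count one change.
        return left_set | right_set, left_cost + right_cost + 1

    return visit(tree)[1]


# Hypothetical data: language D has two words (w2, w3) for the same meaning.
states = {"A": {"w1"}, "B": {"w1"}, "C": {"w2"}, "D": {"w2", "w3"}}
tree_1 = (("A", "B"), ("C", "D"))
tree_2 = (("A", "C"), ("B", "D"))
score_1 = fitch_score(tree_1, states)
score_2 = fitch_score(tree_2, states)
```

Under these assumptions, the tree with the lower score is preferred for this character. A full analysis would sum scores over all characters and search over tree topologies.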

Original language: English (US)
Journal: Transactions of the Philological Society
State: E-pub ahead of print - Apr 9 2024

