Disentangling semantic and prosodic features of English poetry

Wenyi Shang, Ted Underwood

Research output: Contribution to journal › Article › peer-review


The distinction between genre and form is still contested in literary studies. While scholars associated with the New Formalism are criticized for perceiving everything as a form, digital humanists tend to argue that everything is a genre. In this research, we employed machine learning models to classify 36,635 English poems in the Chadwyck-Healey Literature Collections into twenty-seven categories, focusing on their semantic features (lexicons) and prosodic features (meters and rhymes) independently. Our findings reveal that different categories of poetry are distinguished by different groups of characteristics, without a clear-cut division between those driven predominantly by semantic features and those driven predominantly by prosodic features. Instead, poetry categories manifest a combination of semantic and prosodic elements, spanning a spectrum of different strengths in both domains. These findings suggest that the colloquial distinction between “genre” and “form” is based on real differences between poetic categories, although those differences may not be quite as crisply binary as the vocabulary implies.
Original language: English (US)
Article number: fqae008
Journal: Digital Scholarship in the Humanities
State: E-pub ahead of print - Feb 27 2024

