Improving Measures of Text Reuse in English Poetry: A TF–IDF Based Method

Wenyi Shang, Ted Underwood

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Text reuse measurement is important for both LIS and literary studies, where it is mainly used to study influence between authors. Although projects such as Tesserae have already adopted computational methods for investigating text reuse in Latin poetry, the potential applications of these methods to the rich collections of English poetry have not been realized. This research proposes a modified version of the Tesserae Project’s measure, based on the insight embodied in TF–IDF, to study English poetry. Using the Irish poet Yeats’ relationship to five English Romantic poets as a test case, three parallel experiments were conducted to evaluate the suitability of this method for English poetry. The results show that the new method is effective in measuring text reuse in English poetry, and that the TF–IDF based modification is more sensitive to known cases of text reuse than the original method. This method can also be adapted to noncanonical literary works in the future, providing an example of the significance of LIS for digital humanities.

Original language: English (US)
Title of host publication: Diversity, Divergence, Dialogue - 16th International Conference, iConference 2021, Proceedings
Editors: Katharina Toeppe, Hui Yan, Samuel Kai Chu
Publisher: Springer
Pages: 469-477
Number of pages: 9
ISBN (Print): 9783030712914
State: Published - 2021
Event: 16th International Conference on Diversity, Divergence, Dialogue, iConference 2021 - Beijing, China
Duration: Mar 17 2021 → Mar 31 2021

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 12645 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 16th International Conference on Diversity, Divergence, Dialogue, iConference 2021
Country/Territory: China
City: Beijing
Period: 3/17/21 → 3/31/21

Keywords

  • Digital humanities
  • English poetry
  • Method evaluation
  • TF–IDF
  • Text reuse

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science
