On the role of poetic versus nonpoetic features in "kindred" and diachronic poetry attribution

Brent D. Fegley, Vetle Ingvald Torvik

Research output: Contribution to journal (Article)

Abstract

Author attribution studies have demonstrated remarkable success in applying orthographic and lexicographic features of text in a variety of discrimination problems. What might poetic features, such as syllabic stress and mood, contribute? We address this question in the context of two different attribution problems: (a) kindred: differentiate Langston Hughes' early poems from those of kindred poets and (b) diachronic: differentiate Hughes' early from his later poems. Using a diverse set of 535 generic text features, each categorized as poetic or nonpoetic, correlation-based greedy forward search ranked the features and a support vector machine classified the poems. A small subset of features (∼10) achieved cross-validated precision and recall as high as 87%. Poetic features (rhyme patterns particularly) were nearly as effective as nonpoetic in kindred discrimination, but less effective diachronically. In other words, Hughes used both poetic and nonpoetic features in distinctive ways and his use of nonpoetic features evolved systematically while he continued to experiment with poetic features. These findings affirm qualitative studies attesting to structural elements from Black oral tradition and Black folk music (blues) and to the internal consistency of Hughes' early poetry.
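The feature-ranking step described in the abstract can be illustrated with a small sketch. The abstract does not give the scoring formula, so this example assumes the widely used correlation-based feature selection (CFS) merit, which rewards features that correlate with the class and penalizes features that correlate with each other. The function names, the toy data, and the use of Pearson correlation are all hypothetical choices for illustration, not the authors' implementation. The support vector machine and cross-validation steps are omitted.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def cfs_merit(subset, class_corr, feat_corr):
    """Assumed CFS merit: k * mean|r_cf| / sqrt(k + k*(k-1) * mean|r_ff|)."""
    k = len(subset)
    r_cf = sum(class_corr[f] for f in subset) / k
    if k == 1:
        return r_cf
    pairs = [(a, b) for i, a in enumerate(subset) for b in subset[i + 1:]]
    r_ff = sum(feat_corr[a][b] for a, b in pairs) / len(pairs)
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)


def greedy_forward_rank(columns, labels, max_features=10):
    """Rank features by the order in which greedy forward search adds them."""
    class_corr = [abs(pearson(col, labels)) for col in columns]
    feat_corr = [[abs(pearson(a, b)) for b in columns] for a in columns]
    selected, remaining = [], set(range(len(columns)))
    while remaining and len(selected) < max_features:
        # Add the feature that maximizes the merit of the enlarged subset.
        best = max(
            sorted(remaining),
            key=lambda f: cfs_merit(selected + [f], class_corr, feat_corr),
        )
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data (hypothetical): 8 "poems", 4 features, binary class labels.
columns = [
    [0, 0, 1, 1, 0, 0, 1, 1],  # feature 0: informative
    [0, 0, 2, 2, 0, 0, 2, 2],  # feature 1: rescaled copy of feature 0 (redundant)
    [0, 1, 0, 1, 0, 1, 0, 1],  # feature 2: informative, uncorrelated with feature 0
    [5, 5, 5, 5, 5, 5, 5, 5],  # feature 3: constant, carries no information
]
labels = [0, 1, 1, 1, 0, 1, 1, 1]
print(greedy_forward_rank(columns, labels, max_features=4))
```

In this toy setup the two complementary features (0 or its copy 1, together with 2) are picked before the redundant copy, and the constant feature comes last. In a pipeline like the one the abstract describes, the top-ranked features (about 10 in the study) would then be passed to a classifier and evaluated by cross-validated precision and recall.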

Original language: English (US)
Pages (from-to): 2165-2181
Number of pages: 17
Journal: Journal of the American Society for Information Science and Technology
Volume: 63
Issue number: 11
DOI: 10.1002/asi.22727
State: Published - Nov 1 2012

Fingerprint

attribution
poetry
Support vector machines
discrimination
folk music
mood
Experiments
writer
experiment
Poetics
Attribution
Poetry
Discrimination

Keywords

  • computational linguistics
  • machine learning
  • natural language processing

ASJC Scopus subject areas

  • Software
  • Information Systems
  • Human-Computer Interaction
  • Computer Networks and Communications
  • Artificial Intelligence

Cite this

@article{1a6a65ff95404c5595d179eb3eaa76a6,
title = "On the role of poetic versus nonpoetic features in {"}kindred{"} and diachronic poetry attribution",
abstract = "Author attribution studies have demonstrated remarkable success in applying orthographic and lexicographic features of text in a variety of discrimination problems. What might poetic features, such as syllabic stress and mood, contribute? We address this question in the context of two different attribution problems: (a) kindred: differentiate Langston Hughes' early poems from those of kindred poets and (b) diachronic: differentiate Hughes' early from his later poems. Using a diverse set of 535 generic text features, each categorized as poetic or nonpoetic, correlation-based greedy forward search ranked the features and a support vector machine classified the poems. A small subset of features (∼10) achieved cross-validated precision and recall as high as 87{\%}. Poetic features (rhyme patterns particularly) were nearly as effective as nonpoetic in kindred discrimination, but less effective diachronically. In other words, Hughes used both poetic and nonpoetic features in distinctive ways and his use of nonpoetic features evolved systematically while he continued to experiment with poetic features. These findings affirm qualitative studies attesting to structural elements from Black oral tradition and Black folk music (blues) and to the internal consistency of Hughes' early poetry.",
keywords = "computational linguistics, machine learning, natural language processing",
author = "Fegley, {Brent D.} and Torvik, {Vetle Ingvald}",
year = "2012",
month = "11",
day = "1",
doi = "10.1002/asi.22727",
language = "English (US)",
volume = "63",
pages = "2165--2181",
journal = "Journal of the Association for Information Science and Technology",
issn = "2330-1635",
publisher = "John Wiley and Sons Ltd",
number = "11",

}

TY - JOUR

T1 - On the role of poetic versus nonpoetic features in "kindred" and diachronic poetry attribution

AU - Fegley, Brent D.

AU - Torvik, Vetle Ingvald

PY - 2012/11/1

Y1 - 2012/11/1

AB - Author attribution studies have demonstrated remarkable success in applying orthographic and lexicographic features of text in a variety of discrimination problems. What might poetic features, such as syllabic stress and mood, contribute? We address this question in the context of two different attribution problems: (a) kindred: differentiate Langston Hughes' early poems from those of kindred poets and (b) diachronic: differentiate Hughes' early from his later poems. Using a diverse set of 535 generic text features, each categorized as poetic or nonpoetic, correlation-based greedy forward search ranked the features and a support vector machine classified the poems. A small subset of features (∼10) achieved cross-validated precision and recall as high as 87%. Poetic features (rhyme patterns particularly) were nearly as effective as nonpoetic in kindred discrimination, but less effective diachronically. In other words, Hughes used both poetic and nonpoetic features in distinctive ways and his use of nonpoetic features evolved systematically while he continued to experiment with poetic features. These findings affirm qualitative studies attesting to structural elements from Black oral tradition and Black folk music (blues) and to the internal consistency of Hughes' early poetry.

KW - computational linguistics

KW - machine learning

KW - natural language processing

UR - http://www.scopus.com/inward/record.url?scp=84868122494&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84868122494&partnerID=8YFLogxK

U2 - 10.1002/asi.22727

DO - 10.1002/asi.22727

M3 - Article

AN - SCOPUS:84868122494

VL - 63

SP - 2165

EP - 2181

JO - Journal of the Association for Information Science and Technology

JF - Journal of the Association for Information Science and Technology

SN - 2330-1635

IS - 11

ER -