TY - GEN
T1 - Long-range prosody prediction and rhythm
AU - Kochanski, Greg
AU - Loukina, Anastassia
AU - Keane, Elinor
AU - Shih, Chilin
AU - Rosner, Burton
N1 - We thank John Coleman, Anne Cutler, and Daniel Hirst for comments and questions. This project is supported by the Economic and Social Research Council (UK) via RES-062-23-1323, and we acknowledge the National Science Foundation (US) for supporting Dr. Shih via IIS-0623805 and IIS-0534133.
PY - 2010
Y1 - 2010
N2 - Rhythm is expressed by recurring, hence predictable, beat patterns. Poetry in many languages is composed with attention to poetic meters while prose is not. Therefore, one way to investigate speech rhythm is to evaluate how prose reading differs from poetry reading via a quantitative method that measures predictability. We use linear regression to predict the acoustic properties of segments from the properties of up to 7 preceding segments. This explains as much as 41% of the variance in our full (prose) corpus and up to 79% in a sub-corpus of poetry. While roughly half of the predictive power comes from the segment immediately preceding the target, the predicted variance increases by 6% (for the full/prose corpus) or by 25% (for the poetry sub-corpus) upon extending the predictor to include the seven preceding segments. Therefore, interactions between segments extend well beyond the immediate vicinity. Potentially, these longer-range regressions capture the rhythms of the poetry. This approach could form a useful method for characterizing the statistical properties of spoken language, especially in reference to prosody and speech rhythm.
KW - Poetry
KW - Prediction
KW - Prosody
KW - Rhythm
KW - Syllable
UR - https://www.scopus.com/pages/publications/79959591281
M3 - Conference contribution
AN - SCOPUS:79959591281
T3 - Proceedings of the International Conference on Speech Prosody
BT - 5th International Conference on Speech Prosody 2010
PB - International Speech Communication Association
T2 - 5th International Conference on Speech Prosody: Every Language, Every Style, SP 2010
Y2 - 10 May 2010 through 14 May 2010
ER -