Rhythm measures and dimensions of durational variation in speech

Anastassia Loukina, Greg Kochanski, Burton Rosner, Elinor Keane, Chilin Shih

Research output: Contribution to journal › Article

Abstract

Patterns of durational variation were examined by applying 15 previously published rhythm measures to a large corpus of speech from five languages. In order to achieve consistent segmentation across all languages, an automatic speech recognition system was developed to divide the waveforms into consonantal and vocalic regions. The resulting duration measurements rest strictly on acoustic criteria. Machine classification showed that rhythm measures could separate languages at rates above chance. Within-language variability in rhythm measures, however, was large and comparable to that between languages. Therefore, different languages could not be identified reliably from single paragraphs. In experiments separating pairs of languages, a rhythm measure that was relatively successful at separating one pair often performed very poorly on another pair: there was no broadly successful rhythm measure. Separation of all five languages at once required a combination of three rhythm measures. Many triplets were about equally effective, but the confusion patterns between languages varied with the choice of rhythm measures.
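The abstract does not name the 15 rhythm measures. As an illustration of what such measures compute from consonantal and vocalic interval durations, the sketch below implements three widely cited ones from the rhythm literature: %V and a delta measure (Ramus, Nespor and Mehler, 1999) and the normalised pairwise variability index, nPVI (Grabe and Low, 2002). Whether these are among the 15 used in the paper is an assumption, the function names are hypothetical, the use of the population standard deviation is a choice made here, and the durations are made-up values rather than data from the study.

```python
from statistics import mean, pstdev


def percent_v(vocalic, consonantal):
    """%V: percentage of total duration taken up by vocalic intervals."""
    total = sum(vocalic) + sum(consonantal)
    return 100.0 * sum(vocalic) / total


def delta(durations):
    """Delta measure: standard deviation of interval durations.

    Population standard deviation is an assumption of this sketch.
    """
    return pstdev(durations)


def npvi(durations):
    """Normalised pairwise variability index over successive intervals."""
    pairs = zip(durations, durations[1:])
    # Each term is the absolute difference of a neighbouring pair,
    # normalised by the pair's mean duration.
    terms = [abs(a - b) / ((a + b) / 2) for a, b in pairs]
    return 100.0 * mean(terms)


# Hypothetical interval durations in seconds (illustrative only).
vocalic = [0.08, 0.12, 0.06, 0.15]
consonantal = [0.07, 0.10, 0.05, 0.09]

measures = {
    "%V": percent_v(vocalic, consonantal),
    "deltaC": delta(consonantal),
    "nPVI-V": npvi(vocalic),
}
print(measures)
```

Each paragraph of speech would yield one such vector of measures, and a classifier could then be trained on those vectors. The abstract's finding that within-language variability is comparable to between-language variability corresponds to these vectors overlapping heavily across languages.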

Original language: English (US)
Pages (from-to): 3258-3270
Number of pages: 13
Journal: Journal of the Acoustical Society of America
Volume: 129
Issue number: 5
DOI: 10.1121/1.3559709
State: Published - May 1 2011

Fingerprint

rhythm
Language
confusion
speech recognition
waveforms
acoustics

ASJC Scopus subject areas

  • Arts and Humanities (miscellaneous)
  • Acoustics and Ultrasonics

Cite this

Rhythm measures and dimensions of durational variation in speech. / Loukina, Anastassia; Kochanski, Greg; Rosner, Burton; Keane, Elinor; Shih, Chilin.

In: Journal of the Acoustical Society of America, Vol. 129, No. 5, 01.05.2011, p. 3258-3270.

Loukina, Anastassia ; Kochanski, Greg ; Rosner, Burton ; Keane, Elinor ; Shih, Chilin. / Rhythm measures and dimensions of durational variation in speech. In: Journal of the Acoustical Society of America. 2011 ; Vol. 129, No. 5. pp. 3258-3270.
@article{98ab0b587ed149438beaeecdf33db826,
title = "Rhythm measures and dimensions of durational variation in speech",
abstract = "Patterns of durational variation were examined by applying 15 previously published rhythm measures to a large corpus of speech from five languages. In order to achieve consistent segmentation across all languages, an automatic speech recognition system was developed to divide the waveforms into consonantal and vocalic regions. The resulting duration measurements rest strictly on acoustic criteria. Machine classification showed that rhythm measures could separate languages at rates above chance. Within-language variability in rhythm measures, however, was large and comparable to that between languages. Therefore, different languages could not be identified reliably from single paragraphs. In experiments separating pairs of languages, a rhythm measure that was relatively successful at separating one pair often performed very poorly on another pair: there was no broadly successful rhythm measure. Separation of all five languages at once required a combination of three rhythm measures. Many triplets were about equally effective, but the confusion patterns between languages varied with the choice of rhythm measures.",
author = "Anastassia Loukina and Greg Kochanski and Burton Rosner and Elinor Keane and Chilin Shih",
year = "2011",
month = "5",
day = "1",
doi = "10.1121/1.3559709",
language = "English (US)",
volume = "129",
pages = "3258--3270",
journal = "Journal of the Acoustical Society of America",
issn = "0001-4966",
publisher = "Acoustical Society of America",
number = "5",

}

TY - JOUR

T1 - Rhythm measures and dimensions of durational variation in speech

AU - Loukina, Anastassia

AU - Kochanski, Greg

AU - Rosner, Burton

AU - Keane, Elinor

AU - Shih, Chilin

PY - 2011/5/1

Y1 - 2011/5/1

N2 - Patterns of durational variation were examined by applying 15 previously published rhythm measures to a large corpus of speech from five languages. In order to achieve consistent segmentation across all languages, an automatic speech recognition system was developed to divide the waveforms into consonantal and vocalic regions. The resulting duration measurements rest strictly on acoustic criteria. Machine classification showed that rhythm measures could separate languages at rates above chance. Within-language variability in rhythm measures, however, was large and comparable to that between languages. Therefore, different languages could not be identified reliably from single paragraphs. In experiments separating pairs of languages, a rhythm measure that was relatively successful at separating one pair often performed very poorly on another pair: there was no broadly successful rhythm measure. Separation of all five languages at once required a combination of three rhythm measures. Many triplets were about equally effective, but the confusion patterns between languages varied with the choice of rhythm measures.

UR - http://www.scopus.com/inward/record.url?scp=79959612191&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=79959612191&partnerID=8YFLogxK

U2 - 10.1121/1.3559709

DO - 10.1121/1.3559709

M3 - Article

C2 - 21568427

AN - SCOPUS:79959612191

VL - 129

SP - 3258

EP - 3270

JO - Journal of the Acoustical Society of America

JF - Journal of the Acoustical Society of America

SN - 0001-4966

IS - 5

ER -