Prosody modeling with soft templates

Greg Kochanski, Chilin Shih

Research output: Contribution to journal › Article › peer-review

Abstract

This paper describes a novel prosody generation model. We intend it to broadly support many linguistic theories and multiple languages, for the model imposes no restriction on accent categories and shapes. This capability is crucial to the next generation of text-to-speech systems that will need to synthesize intonation variations for different speech acts, emotions, and styles of speech. The system supports mark-up tags that are mathematically defined and generate f0 deterministically. Underlying the tags is an articulatory model of accent interaction which balances physiological and communication constraints. We specify the model by way of an algorithm for calculating the pitch, and by way of examples. The model allows localized, linguistically reasonable tags, and is suitable for a data-driven fitting process.
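To make the idea of balancing physiological and communication constraints concrete, here is a minimal sketch and not the paper's algorithm. It assumes a quadratic cost: a smoothness term that penalizes frame-to-frame pitch changes (standing in for the physiological constraint) plus a strength-weighted squared error between the pitch and each tag's template (standing in for the communication constraint). The function name `soft_template_pitch`, the `(start, targets, strength)` tag format, and the `smoothness` parameter are all hypothetical. Because the cost is quadratic, the minimizer is the solution of a tridiagonal linear system, so the same tags always yield the same f0 curve.

```python
def soft_template_pitch(n_frames, tags, smoothness=1.0):
    """Return a pitch curve (list of floats) that minimizes
        smoothness * sum_t (p[t+1] - p[t])**2
        + sum over tags, frames of strength * (p[t] - target[t])**2.

    ASSUMPTION: this quadratic cost is an illustrative stand-in, not the
    model defined in the paper. `tags` is a list of hypothetical
    (start_frame, target_values, strength) tuples.
    """
    if n_frames < 1:
        raise ValueError("n_frames must be at least 1")
    if smoothness <= 0:
        raise ValueError("smoothness must be positive")

    # Accumulate per-frame weights and weighted targets from all tags.
    weight = [0.0] * n_frames
    weighted_target = [0.0] * n_frames
    for start, targets, strength in tags:
        for offset, target in enumerate(targets):
            t = start + offset
            if 0 <= t < n_frames:
                weight[t] += strength
                weighted_target[t] += strength * target
    if not any(w > 0 for w in weight):
        raise ValueError("at least one tag must cover a frame")

    # Normal equations form a tridiagonal system A p = weighted_target,
    # with off-diagonal entries -smoothness and the diagonal below.
    off = -smoothness
    diag = []
    for t in range(n_frames):
        neighbours = (t > 0) + (t < n_frames - 1)
        diag.append(smoothness * neighbours + weight[t])

    # Thomas algorithm: forward elimination, then back substitution.
    c_prime = [0.0] * n_frames
    d_prime = [0.0] * n_frames
    c_prime[0] = off / diag[0]
    d_prime[0] = weighted_target[0] / diag[0]
    for t in range(1, n_frames):
        denom = diag[t] - off * c_prime[t - 1]
        c_prime[t] = off / denom
        d_prime[t] = (weighted_target[t] - off * d_prime[t - 1]) / denom

    pitch = [0.0] * n_frames
    pitch[-1] = d_prime[-1]
    for t in range(n_frames - 2, -1, -1):
        pitch[t] = d_prime[t] - c_prime[t] * pitch[t + 1]
    return pitch


# Usage: a strong high accent followed by a weak low accent (values in Hz).
curve = soft_template_pitch(
    10,
    [(0, [200.0, 200.0, 200.0], 10.0), (7, [100.0, 100.0, 100.0], 0.5)],
)
```

In this sketch a tag with higher strength pulls the curve closer to its own template, while a weaker neighbor gives way, which is one simple reading of a "soft" template.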

Original language: English (US)
Pages (from-to): 311-352
Number of pages: 42
Journal: Speech Communication
Volume: 39
Issue number: 3-4
State: Published - Feb 2003
Externally published: Yes

Keywords

  • Algorithm
  • Communication
  • Computer language
  • Dynamics
  • Intonation
  • Mark-up language
  • Modeling
  • Physiology
  • Pitch
  • Prosody
  • Speech
  • Text-to-speech
  • Tone
  • XML

ASJC Scopus subject areas

  • Software
  • Modeling and Simulation
  • Communication
  • Language and Linguistics
  • Linguistics and Language
  • Computer Vision and Pattern Recognition
  • Computer Science Applications
