Modeling tones in continuous Cantonese speech

Tan Lee, Greg Kochanski, Chilin Shih, Yujia Li

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution


Cantonese is a major Chinese dialect with a complicated tone system. This research focuses on quantitative modeling of Cantonese tones using Stem-ML, a language-independent framework for quantitative intonation modeling and generation. A set of F0 prediction models is built and trained on acoustic data. The prediction error is about 11 Hz, or 1 semitone. The resulting optimal model parameters are analyzed in light of linguistic knowledge. Key observations include: (1) there is no obvious advantage in modeling the entering tones separately, since they can be treated simply as truncated versions of the non-entering tones; (2) Cantonese appears to have a declining phrase intonation; (3) tones at the initial positions of a phrase or sentence tend to have greater prosodic strength than those at the final positions; (4) content words are stronger than function words; (5) long words are stronger than short words.
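The equivalence between an 11 Hz error and 1 semitone depends on the F0 level at which it is measured, because the semitone scale is logarithmic. The following is a minimal sketch of that arithmetic, not a procedure taken from the paper. The function names are hypothetical, and the reference F0 is derived rather than reported: the abstract does not state the speakers' average pitch.

```python
import math


def hz_to_semitones(f_hz: float, ref_hz: float) -> float:
    """Interval in semitones from ref_hz up to f_hz (12 semitones per octave)."""
    return 12.0 * math.log2(f_hz / ref_hz)


def reference_for_one_semitone(error_hz: float) -> float:
    """Reference F0 (Hz) at which an upward deviation of error_hz is exactly one semitone.

    One semitone is a frequency ratio of 2**(1/12), so
    (ref + error) / ref = 2**(1/12)  =>  ref = error / (2**(1/12) - 1).
    """
    return error_hz / (2.0 ** (1.0 / 12.0) - 1.0)


# Assumption: the 11 Hz figure from the abstract is treated as an upward deviation.
ref = reference_for_one_semitone(11.0)
print(f"reference F0: {ref:.1f} Hz")
print(f"11 Hz above it: {hz_to_semitones(ref + 11.0, ref):.3f} semitones")
```

Under these assumptions the two figures coincide at a reference F0 of roughly 185 Hz. At a lower average pitch the same 11 Hz error would span more than one semitone, and at a higher pitch less.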

Original language: English (US)
Title of host publication: 7th International Conference on Spoken Language Processing, ICSLP 2002
Publisher: International Speech Communication Association
Number of pages: 4
State: Published - 2002
Externally published: Yes
Event: 7th International Conference on Spoken Language Processing, ICSLP 2002 - Denver, United States
Duration: Sep 16 2002 to Sep 20 2002


Other: 7th International Conference on Spoken Language Processing, ICSLP 2002
Country/Territory: United States

ASJC Scopus subject areas

  • Language and Linguistics
  • Linguistics and Language

