Abstract
This paper presents a mathematical description of style in speech and singing. A style is represented as a set of portable prosodic features together with a set of rules that determine where those features are applied. Speakers and singers make creative choices to express their personal style, which may involve specific phrase curves, accent shapes, or, similarly, musical embellishments. A quantitative model of style therefore needs to support unconstrained descriptions of accents and phrase curves, and to resolve the conflicts that can arise from this freedom. Our current implementation modifies two acoustic parameters: f0 and amplitude. We use an articulator-based model, Stem-ML, to resolve conflicts between intended accents or embellishments and their environment. We present several examples that illustrate the modeling of accents and phrase curves, the usefulness of separating style from content, and the similarity between speech and music.
Original language | English (US) |
---|---|
Pages (from-to) | 393-408 |
Number of pages | 16 |
Journal | International Journal of Speech Technology |
Volume | 6 |
Issue number | 4 |
State | Published - Oct 2003 |
Externally published | Yes |
Keywords
- Accent
- Music
- Pitch
- Quantitative models
- Speech synthesis
- Stem-ML
ASJC Scopus subject areas
- Software
- Language and Linguistics
- Human-Computer Interaction
- Linguistics and Language
- Computer Vision and Pattern Recognition