Articulatory speech synthesis from the fluid dynamics of the vocal apparatus

Stephen Levinson, Don Davis, Scot Slimon, Jun Huang

Research output: Contribution to journal › Article › peer-review

Abstract

This book addresses the problem of articulatory speech synthesis based on computed vocal tract geometries and the basic physics of sound production in the vocal tract. Unlike conventional methods based on analysis/synthesis using the well-known source-filter model, which assumes the independence of the excitation and the filter, we treat the entire vocal apparatus as one mechanical system that produces sound by means of fluid dynamics. The vocal apparatus is represented as a three-dimensional time-varying mechanism, and sound propagation inside it is modeled as the non-planar propagation of acoustic waves through a viscous, compressible fluid described by the Navier-Stokes equations. We propose a combined minimum energy and minimum jerk criterion to compute the dynamics of the vocal tract during articulation. Theoretical error bounds and experimental results show that this method obtains a close match to the phonetic target positions while avoiding abrupt changes in the articulatory trajectory. The vocal folds are set into aerodynamic oscillation by the flow of air from the lungs. The modulated air stream then excites the moving vocal tract. This method shows strong evidence for source-filter interaction. Based on our results, we propose that the articulatory speech production model has the potential to synthesize speech and provide a compact parameterization of the speech signal that can be useful in a wide variety of speech signal processing problems.
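
As a rough illustration of the minimum jerk idea mentioned in the abstract, the sketch below generates a smooth point-to-point trajectory between two articulatory target positions using the classic quintic minimum-jerk profile (zero velocity and acceleration at both endpoints). It is not the combined minimum energy and minimum jerk criterion proposed in the book, whose formulation is not given here. The function names and the two-parameter articulator vectors are hypothetical and chosen only for the example.

```python
def minimum_jerk_profile(tau):
    """Normalized minimum-jerk position profile for tau in [0, 1].

    This quintic rises smoothly from 0 to 1 with zero velocity and
    zero acceleration at both ends.
    """
    return 10 * tau**3 - 15 * tau**4 + 6 * tau**5


def minimum_jerk_trajectory(start, target, n_samples=11):
    """Interpolate between two articulator positions along the profile.

    start, target: sequences of equal length (hypothetical articulatory
    parameters, e.g. a tongue-height and a lip-aperture value).
    Returns a list of n_samples positions, each a list of floats.
    """
    if n_samples < 2:
        raise ValueError("n_samples must be at least 2")
    if len(start) != len(target):
        raise ValueError("start and target must have the same length")

    trajectory = []
    for k in range(n_samples):
        tau = k / (n_samples - 1)          # normalized time in [0, 1]
        s = minimum_jerk_profile(tau)      # fraction of the movement completed
        trajectory.append([a + s * (b - a) for a, b in zip(start, target)])
    return trajectory


if __name__ == "__main__":
    # Move two hypothetical articulatory parameters between phonetic targets.
    for position in minimum_jerk_trajectory([0.0, 1.0], [1.0, 3.0], n_samples=5):
        print(position)
```

The trajectory starts and ends exactly at the given targets and changes slowly near both, which is the qualitative behavior the abstract describes as avoiding abrupt changes in the articulatory trajectory.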

Original language: English (US)
Pages (from-to): 1-118
Number of pages: 118
Journal: Synthesis Lectures on Speech and Audio Processing
Volume: 9
State: Published - Jul 19 2012

Keywords

  • Navier-Stokes equations
  • articulatory dynamics
  • articulatory speech synthesis
  • computational fluid dynamics
  • human vocal apparatus

ASJC Scopus subject areas

  • Signal Processing
  • Acoustics and Ultrasonics
  • Electrical and Electronic Engineering
