An iterative approach to decision tree training for context dependent speech synthesis

Xiayu Chen, Yang Zhang, Mark Allan Hasegawa-Johnson

Research output: Contribution to journal (Article)

Abstract

The explicit-duration hidden Markov model (EDHMM) with decision trees is a popular model for parametric speech synthesis. The traditional training procedure constructs the decision trees after the observation probability densities have been optimized with the EM algorithm, assuming that the state assignment probabilities do not change much during tree construction. This paper proposes an iterative algorithm that removes this assumption. In the new algorithm, decision tree construction is incorporated into the EM iteration, with a safeguard procedure that ensures convergence. Evaluation on the Boston University Radio Speech Corpus shows that the proposed algorithm achieves a significantly better optimum on the training set than the original one, and that the advantage generalizes well to the test set.

Original language: English (US)
Pages (from-to): 2327-2331
Number of pages: 5
Journal: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
State: Published - 2014

Keywords

  • Decision tree
  • EM algorithm
  • Speech clustering
  • Speech synthesis

ASJC Scopus subject areas

  • Language and Linguistics
  • Human-Computer Interaction
  • Signal Processing
  • Software
  • Modeling and Simulation
