Gaussian mixture models of phonetic boundaries for speech recognition

M. K. Omar, M. Hasegawa-Johnson, S. Levinson

Research output: Chapter in Book/Report/Conference proceeding (Conference contribution)

Abstract

A new approach to representing temporal correlation in an automatic speech recognition system is described. It introduces an acoustic feature set that captures the dynamics of the speech signal at phoneme boundaries and combines it with the traditional acoustic feature set, which represents the periods of speech that are assumed to be quasi-stationary. The newly introduced feature set is treated as an observed random vector associated with the state transition in the hidden Markov model (HMM). For the same complexity and number of parameters, this approach improves phoneme recognition accuracy by 3.5% compared to context-independent HMM models. Stop consonant recognition accuracy is increased by 40%.
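The idea of attaching an observation to the state transition can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes diagonal-covariance Gaussian mixtures, one candidate boundary feature vector per frame, and that the boundary vector is scored only when the path changes state. The names `gmm_logpdf`, `path_log_score`, `state_gmms`, and `trans_gmms` are hypothetical.

```python
import numpy as np


def gmm_logpdf(x, weights, means, variances):
    """Log-density of a diagonal-covariance Gaussian mixture at vector x.

    weights: shape (K,), means and variances: shape (K, D).
    """
    x = np.asarray(x, dtype=float)
    # Log-density of each component, summed over the D dimensions.
    log_comp = -0.5 * np.sum(
        np.log(2.0 * np.pi * variances) + (x - means) ** 2 / variances, axis=1
    )
    a = np.log(weights) + log_comp
    m = np.max(a)
    # Log-sum-exp over the K components for numerical stability.
    return m + np.log(np.sum(np.exp(a - m)))


def path_log_score(states, frame_feats, boundary_feats, log_trans,
                   state_gmms, trans_gmms):
    """Log-score of one state path (illustrative assumption, not the paper's recipe).

    Every frame contributes a state emission. A change of state additionally
    contributes a boundary emission from a GMM tied to that transition.
    """
    score = gmm_logpdf(frame_feats[0], *state_gmms[states[0]])
    for t in range(1, len(states)):
        prev, cur = states[t - 1], states[t]
        score += log_trans[prev, cur]
        score += gmm_logpdf(frame_feats[t], *state_gmms[cur])
        if prev != cur:
            # Transition-attached observation: the boundary feature vector.
            score += gmm_logpdf(boundary_feats[t], *trans_gmms[(prev, cur)])
    return score
```

In this sketch `state_gmms[s]` and `trans_gmms[(i, j)]` are each a `(weights, means, variances)` tuple, so the boundary model adds one extra log-likelihood term per state change and leaves the usual per-frame emission terms unchanged.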

Original language: English (US)
Title of host publication: 2001 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2001 - Conference Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 33-36
Number of pages: 4
ISBN (Electronic): 078037343X, 9780780373433
State: Published - 2001
Event: IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2001 - Madonna di Campiglio, Italy
Duration: Dec 9 2001 to Dec 13 2001

Publication series

Name: 2001 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2001 - Conference Proceedings

Other

Other: IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2001
Country/Territory: Italy
City: Madonna di Campiglio
Period: 12/9/01 to 12/13/01

ASJC Scopus subject areas

  • Hardware and Architecture
  • Electrical and Electronic Engineering
