A factorial HMM approach to robust isolated digit recognition in background music

Ameya Nitin Deoras, Mark Hasegawa-Johnson

Research output: Contribution to conference (Paper)

Abstract

This paper presents a novel solution to the problem of isolated digit recognition in background music. A Factorial Hidden Markov Model (FHMM) architecture is proposed to accurately model the simultaneous occurrence of two independent processes, such as an utterance of a digit and an excerpt of music. The FHMM is implemented with its equivalent HMM by extending Nadas' MIXMAX algorithm to a mixture of Gaussians PDF. At around 0 dB SNR, the proposed system shows an average relative reduction in word error rate of 57% in the recognition of isolated digits in background music.
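The two ideas named in the abstract can be illustrated with a minimal sketch; it is not the authors' implementation. It assumes two independent Markov chains (one for the digit, one for the music) whose joint "equivalent HMM" transition matrix is the Kronecker product of the individual matrices, and it assumes the MIXMAX observation model in its usual form, where the observed log-spectral value is the maximum of the speech and music values, so that p(y) = p_x(y) F_n(y) + p_n(y) F_x(y). The scalar (one-dimensional) mixture extension, all function names, and all numeric values below are hypothetical and chosen only for illustration.

```python
import math

def kron(a, b):
    """Kronecker product of two transition matrices given as lists of rows.

    Entry ((i, k), (j, l)) of the result is a[i][j] * b[k][l], which is the
    joint transition probability of two independent chains.
    """
    return [[a_ij * b_kl for a_ij in row_a for b_kl in row_b]
            for row_a in a for row_b in b]

def normal_pdf(y, mean, std):
    z = (y - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def normal_cdf(y, mean, std):
    return 0.5 * (1.0 + math.erf((y - mean) / (std * math.sqrt(2.0))))

def gmm_pdf(y, gmm):
    """Density of a scalar Gaussian mixture; gmm is a list of (weight, mean, std)."""
    return sum(w * normal_pdf(y, m, s) for w, m, s in gmm)

def gmm_cdf(y, gmm):
    """Cumulative distribution of a scalar Gaussian mixture."""
    return sum(w * normal_cdf(y, m, s) for w, m, s in gmm)

def mixmax_density(y, speech_gmm, music_gmm):
    """Density of y = max(x, n) for independent x ~ speech_gmm, n ~ music_gmm."""
    return (gmm_pdf(y, speech_gmm) * gmm_cdf(y, music_gmm)
            + gmm_pdf(y, music_gmm) * gmm_cdf(y, speech_gmm))

# Hypothetical 2-state chains: a left-to-right digit model and an ergodic music model.
speech_trans = [[0.9, 0.1], [0.0, 1.0]]
music_trans = [[0.7, 0.3], [0.4, 0.6]]
joint_trans = kron(speech_trans, music_trans)  # 4 x 4 product-state matrix

# Hypothetical scalar mixtures for one product state (weight, mean, std).
speech_gmm = [(0.6, 2.0, 1.0), (0.4, 5.0, 0.5)]
music_gmm = [(1.0, 3.0, 1.5)]
likelihood = mixmax_density(4.0, speech_gmm, music_gmm)
```

Each product state pairs one digit state with one music state, so the joint matrix has as many states as the product of the two chain sizes. This growth is the usual cost of replacing a factorial model with an equivalent single-chain HMM.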

Original language: English (US)
Pages: 2093-2096
Number of pages: 4
State: Published - Jan 1 2004
Event: 8th International Conference on Spoken Language Processing, ICSLP 2004 - Jeju, Jeju Island, Korea, Republic of
Duration: Oct 4 2004 to Oct 8 2004

ASJC Scopus subject areas

  • Language and Linguistics
  • Linguistics and Language

Cite this

Deoras, A. N., & Hasegawa-Johnson, M. (2004). A factorial HMM approach to robust isolated digit recognition in background music. 2093-2096. Paper presented at 8th International Conference on Spoken Language Processing, ICSLP 2004, Jeju, Jeju Island, Korea, Republic of.