Abstract
This paper presents a novel solution to the problem of recognizing isolated digits spoken over background music. A Factorial Hidden Markov Model (FHMM) architecture is proposed to model the simultaneous occurrence of two independent processes, such as an utterance of a digit and an excerpt of music. The FHMM is implemented as its equivalent HMM by extending Nadas' MIXMAX algorithm to mixture-of-Gaussians PDFs. At around 0 dB SNR, the proposed system shows an average relative reduction in word error rate of 57% in the recognition of isolated digits in background music.
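As a rough illustration of the MIXMAX idea mentioned in the abstract, the sketch below evaluates the likelihood of an observation under the assumption that, in each log-spectral dimension, the observed value is the maximum of an independent speech value and music value. For independent x and n with y = max(x, n), the density is p_x(y) F_n(y) + F_x(y) p_n(y); summing over pairs of Gaussian components gives a mixture version. This is a textbook-style formulation under stated assumptions (diagonal covariances, hypothetical function names and data layout), not necessarily the exact extension used in the paper.

```python
import math

def normal_pdf(y, mean, std):
    """Density of a univariate Gaussian at y."""
    z = (y - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def normal_cdf(y, mean, std):
    """Cumulative distribution of a univariate Gaussian at y."""
    return 0.5 * (1.0 + math.erf((y - mean) / (std * math.sqrt(2.0))))

def mixmax_pdf_1d(y, speech, music):
    """Density of max(x, n) at y for independent Gaussians x (speech) and n (music).

    speech and music are (mean, std) pairs; this layout is an assumption
    made for the sketch.
    """
    (mx, sx), (mn, sn) = speech, music
    return (normal_pdf(y, mx, sx) * normal_cdf(y, mn, sn)
            + normal_cdf(y, mx, sx) * normal_pdf(y, mn, sn))

def mixmax_gmm_likelihood(y, speech_gmm, music_gmm):
    """Likelihood of vector y for one (speech state, music state) pair.

    Each GMM is a list of (weight, means, stds) tuples with diagonal
    covariances (hypothetical layout). Every pair of components contributes
    a product of per-dimension MIXMAX densities.
    """
    total = 0.0
    for w, s_means, s_stds in speech_gmm:
        for v, m_means, m_stds in music_gmm:
            prod = 1.0
            for d in range(len(y)):
                prod *= mixmax_pdf_1d(y[d],
                                      (s_means[d], s_stds[d]),
                                      (m_means[d], m_stds[d]))
            total += w * v * prod
    return total
```

When the music component lies far below the speech component in a given dimension, the MIXMAX density reduces to the speech density alone, which matches the intuition that the louder source dominates the log spectrum.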
Original language | English (US) |
---|---|
Pages | 2093-2096 |
Number of pages | 4 |
State | Published - 2004 |
Event | 8th International Conference on Spoken Language Processing, ICSLP 2004, Jeju, Jeju Island, Republic of Korea (Oct 4 2004 → Oct 8 2004) |
ASJC Scopus subject areas
- Language and Linguistics
- Linguistics and Language