A factorial HMM approach to simultaneous recognition of isolated digits spoken by multiple talkers on one audio channel

Ameya Nitin Deoras, Mark Allan Hasegawa-Johnson

Research output: Contribution to journal (Article)

Abstract

This paper addresses the novel problem of recognizing digits spoken simultaneously by two different talkers. A Factorial Hidden Markov Model architecture is proposed to accurately model the simultaneous utterance of two digits. Nadas' MIXMAX approximation is extended to a mixture-of-Gaussians observation PDF, which enables the implementation of the proposed system. The multiple-digit recognizer is found to successfully recognize pairs of simultaneous digit utterances at 0 dB SNR with up to 89% accuracy.
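
To make the MIXMAX idea concrete, here is a minimal sketch under stated assumptions, not the authors' implementation. It assumes the observed log-spectral value in one frequency bin is the maximum of two independent source values, each modeled by a one-dimensional Gaussian mixture. The density of the maximum is then p(y) = p_a(y) F_b(y) + p_b(y) F_a(y), where p is a mixture density and F its cumulative distribution. All function names and the example mixture parameters below are hypothetical.

```python
import math


def normal_pdf(y, mean, std):
    """Density of a scalar Gaussian."""
    z = (y - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def normal_cdf(y, mean, std):
    """Cumulative distribution of a scalar Gaussian, via the error function."""
    return 0.5 * (1.0 + math.erf((y - mean) / (std * math.sqrt(2.0))))


def gmm_pdf(y, gmm):
    """Mixture density; gmm is a list of (weight, mean, std) tuples."""
    return sum(w * normal_pdf(y, m, s) for w, m, s in gmm)


def gmm_cdf(y, gmm):
    """Mixture cumulative distribution."""
    return sum(w * normal_cdf(y, m, s) for w, m, s in gmm)


def mixmax_pdf(y, gmm_a, gmm_b):
    """Density of y = max(x_a, x_b) for independent x_a ~ gmm_a, x_b ~ gmm_b."""
    return (gmm_pdf(y, gmm_a) * gmm_cdf(y, gmm_b)
            + gmm_pdf(y, gmm_b) * gmm_cdf(y, gmm_a))


# Hypothetical single-bin models for the two talkers (illustrative values only).
talker_a = [(0.6, -1.0, 0.5), (0.4, 2.0, 1.0)]
talker_b = [(1.0, 0.0, 1.0)]

if __name__ == "__main__":
    for y in (-1.0, 0.0, 2.0):
        print(y, mixmax_pdf(y, talker_a, talker_b))
```

In a factorial HMM, a likelihood of this kind would be evaluated for each pair of states (one per talker) and, for multi-bin feature vectors with diagonal covariances, multiplied across bins. How the paper organizes that computation is not stated in the abstract.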

Original language: English (US)
Journal: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume: 1
State: Published - 2004

Fingerprint

digits
Hidden Markov models
approximation

ASJC Scopus subject areas

  • Electrical and Electronic Engineering
  • Signal Processing
  • Acoustics and Ultrasonics

Cite this

@article{139e397c8e3c4a55bd0ee8d1a4b03cff,
title = "A factorial HMM approach to simultaneous recognition of isolated digits spoken by multiple talkers on one audio channel",
abstract = "This paper addresses the novel problem of recognizing digits spoken simultaneously by two different talkers. A Factorial Hidden Markov Model architecture is proposed to accurately model the simultaneous utterance of two digits. Nadas' MIXMAX approximation is extended to a mixture-of-Gaussians observation PDF, which enables the implementation of the proposed system. The multiple-digit recognizer is found to successfully recognize pairs of simultaneous digit utterances at 0 dB SNR with up to 89{\%} accuracy.",
author = "Deoras, {Ameya Nitin} and Hasegawa-Johnson, {Mark Allan}",
year = "2004",
language = "English (US)",
volume = "1",
journal = "Proceedings - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing",
issn = "0736-7791",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
}

TY  - JOUR
T1  - A factorial HMM approach to simultaneous recognition of isolated digits spoken by multiple talkers on one audio channel
AU  - Deoras, Ameya Nitin
AU  - Hasegawa-Johnson, Mark Allan
PY  - 2004
Y1  - 2004
N2  - This paper addresses the novel problem of recognizing digits spoken simultaneously by two different talkers. A Factorial Hidden Markov Model architecture is proposed to accurately model the simultaneous utterance of two digits. Nadas' MIXMAX approximation is extended to a mixture-of-Gaussians observation PDF, which enables the implementation of the proposed system. The multiple-digit recognizer is found to successfully recognize pairs of simultaneous digit utterances at 0 dB SNR with up to 89% accuracy.
AB  - This paper addresses the novel problem of recognizing digits spoken simultaneously by two different talkers. A Factorial Hidden Markov Model architecture is proposed to accurately model the simultaneous utterance of two digits. Nadas' MIXMAX approximation is extended to a mixture-of-Gaussians observation PDF, which enables the implementation of the proposed system. The multiple-digit recognizer is found to successfully recognize pairs of simultaneous digit utterances at 0 dB SNR with up to 89% accuracy.
UR  - http://www.scopus.com/inward/record.url?scp=4544369701&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=4544369701&partnerID=8YFLogxK
M3  - Article
AN  - SCOPUS:4544369701
VL  - 1
JO  - Proceedings - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing
JF  - Proceedings - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing
SN  - 0736-7791
ER  -