Abstract

This paper presents a novel fused hidden Markov model (fused HMM) for integrating tightly coupled time series, such as audio and visual features of speech. In this model, the time series are first modeled by two conventional HMMs separately. The resulting HMMs are then fused together using a probabilistic fusion model, which is optimal according to the maximum entropy principle and a maximum mutual information criterion. Simulations and bimodal speaker verification experiments show that the proposed model can significantly reduce the recognition errors in noiseless or noisy environments.

Original language: English (US)
Pages (from-to): 573-581
Number of pages: 9
Journal: IEEE Transactions on Signal Processing
Volume: 52
Issue number: 3
DOI: 10.1109/TSP.2003.822353
State: Published - Mar 1 2004

Keywords

  • Bimodal speech processing
  • Hidden Markov model
  • Information fusion

ASJC Scopus subject areas

  • Electrical and Electronic Engineering
  • Signal Processing

Cite this

@article{41121a88ac544f53a92b2e93b09320ca,
title = "A Fused Hidden Markov Model with Application to Bimodal Speech Processing",
abstract = "This paper presents a novel fused hidden Markov model (fused HMM) for integrating tightly coupled time series, such as audio and visual features of speech. In this model, the time series are first modeled by two conventional HMMs separately. The resulting HMMs are then fused together using a probabilistic fusion model, which is optimal according to the maximum entropy principle and a maximum mutual information criterion. Simulations and bimodal speaker verification experiments show that the proposed model can significantly reduce the recognition errors in noiseless or noisy environments.",
keywords = "Bimodal speech processing, Hidden Markov model, Information fusion",
author = "Pan, Hao and Levinson, Stephen E. and Huang, Thomas S. and Liang, Zhi-Pei",
year = "2004",
month = "3",
day = "1",
doi = "10.1109/TSP.2003.822353",
language = "English (US)",
volume = "52",
pages = "573--581",
journal = "IEEE Transactions on Signal Processing",
issn = "1053-587X",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
number = "3"
}

TY  - JOUR
T1  - A Fused Hidden Markov Model with Application to Bimodal Speech Processing
AU  - Pan, Hao
AU  - Levinson, Stephen E
AU  - Huang, Thomas S
AU  - Liang, Zhi-Pei
PY  - 2004/3/1
Y1  - 2004/3/1
N2  - This paper presents a novel fused hidden Markov model (fused HMM) for integrating tightly coupled time series, such as audio and visual features of speech. In this model, the time series are first modeled by two conventional HMMs separately. The resulting HMMs are then fused together using a probabilistic fusion model, which is optimal according to the maximum entropy principle and a maximum mutual information criterion. Simulations and bimodal speaker verification experiments show that the proposed model can significantly reduce the recognition errors in noiseless or noisy environments.
AB  - This paper presents a novel fused hidden Markov model (fused HMM) for integrating tightly coupled time series, such as audio and visual features of speech. In this model, the time series are first modeled by two conventional HMMs separately. The resulting HMMs are then fused together using a probabilistic fusion model, which is optimal according to the maximum entropy principle and a maximum mutual information criterion. Simulations and bimodal speaker verification experiments show that the proposed model can significantly reduce the recognition errors in noiseless or noisy environments.
KW  - Bimodal speech processing
KW  - Hidden Markov model
KW  - Information fusion
UR  - http://www.scopus.com/inward/record.url?scp=1542303714&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=1542303714&partnerID=8YFLogxK
U2  - 10.1109/TSP.2003.822353
DO  - 10.1109/TSP.2003.822353
M3  - Article
AN  - SCOPUS:1542303714
VL  - 52
SP  - 573
EP  - 581
JO  - IEEE Transactions on Signal Processing
JF  - IEEE Transactions on Signal Processing
SN  - 1053-587X
IS  - 3
ER  -