Maximum mutual information estimation with unlabeled data for phonetic classification

Jui Ting Huang, Mark Hasegawa-Johnson

Research output: Contribution to journal › Conference article

Abstract

This paper proposes a new training framework for mixed labeled and unlabeled data and evaluates it on the task of binary phonetic classification. Our training objective function combines Maximum Mutual Information (MMI) for labeled data with Maximum Likelihood (ML) for unlabeled data. Through the modified objective, MMI estimates are smoothed with ML estimates obtained from unlabeled data; the criterion can also help an existing model adapt to new speech characteristics present in unlabeled speech. In our phonetic classification experiments, error rates drop consistently from MLE to MMIE with I-smoothing, and again to MMIE with unlabeled-data smoothing; transductive MMIE reduces them further. In a gender-mismatched experiment, the best result shows that MMIE with unlabeled data achieves an absolute error rate 9.3% lower than MLE and 2.35% lower than MMIE with I-smoothing.
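The combined criterion described in the abstract can be sketched numerically: an MMI term (log posterior of the correct class) over labeled frames, plus a weighted ML term (marginal log-likelihood) over unlabeled frames. This is a minimal illustrative sketch, not the authors' implementation; the function names, the array layout, and the interpolation weight `tau` are assumptions introduced here for illustration.

```python
import numpy as np

def logsumexp(a, axis=-1):
    # Numerically stable log(sum(exp(a))) along the given axis.
    m = np.max(a, axis=axis, keepdims=True)
    return (m + np.log(np.sum(np.exp(a - m), axis=axis, keepdims=True))).squeeze(axis)

def combined_objective(ll_labeled, labels, ll_unlabeled, log_priors, tau=1.0):
    """F = F_MMI(labeled) + tau * F_ML(unlabeled)  -- illustrative sketch.

    ll_labeled:   (N, C) per-frame class log-likelihoods, labeled frames
    labels:       (N,)   correct class index per labeled frame
    ll_unlabeled: (M, C) per-frame class log-likelihoods, unlabeled frames
    log_priors:   (C,)   log class priors
    tau:          hypothetical weight on the unlabeled ML term
    """
    joint = ll_labeled + log_priors  # log p(x, c) for each labeled frame
    # MMI term: log posterior of the correct class for each labeled frame.
    f_mmi = np.sum(joint[np.arange(len(labels)), labels] - logsumexp(joint))
    # ML term: marginal log-likelihood of each unlabeled frame.
    f_ml = np.sum(logsumexp(ll_unlabeled + log_priors))
    return f_mmi + tau * f_ml
```

With `tau = 0` this reduces to plain MMI training on the labeled data; larger `tau` smooths the MMI estimates toward the ML fit of the unlabeled speech.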

Original language: English (US)
Pages (from-to): 952-955
Number of pages: 4
Journal: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
State: Published - Dec 1 2008
Event: INTERSPEECH 2008 - 9th Annual Conference of the International Speech Communication Association - Brisbane, QLD, Australia
Duration: Sep 22 2008 - Sep 26 2008

Keywords

  • Gaussian mixture models
  • Maximum mutual information
  • Unlabeled speech

ASJC Scopus subject areas

  • Human-Computer Interaction
  • Signal Processing
  • Software
  • Sensory Systems
