On Semi-Supervised Learning of Gaussian Mixture Models for Phonetic Classification

Jui Ting Huang, Mark Hasegawa-Johnson

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

This paper investigates semi-supervised learning of Gaussian mixture models using a unified objective function that takes both labeled and unlabeled data into account. Two methods are compared: the hybrid discriminative/generative method and the purely generative method. They differ in the criterion applied to the labeled data; the hybrid method uses the class posterior probabilities, and the purely generative method uses the data likelihood. We conducted experiments on the TIMIT database and on a standard synthetic data set from the UCI Machine Learning Repository. The results show that the two methods behave similarly under various conditions. For both methods, unlabeled data improve the training of higher-complexity models, for which the supervised method performs poorly. In addition, there is a trend that more unlabeled data yield a larger improvement in classification accuracy over the supervised model. We also provide experimental observations on the relative weights of the labeled and unlabeled parts of the training objective and suggest a critical value that could be useful for selecting a good weighting factor.
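The two criteria described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the exact form of the objective, so the code assumes one-dimensional features, a labeled term of log p(y | x) for the hybrid method and log p(x, y) for the purely generative method, and an unlabeled term of log p(x) scaled by a weighting factor. The names `objective`, `params`, and `alpha` are hypothetical.

```python
import math


def log_gauss(x, mean, var):
    """Log density of a 1-D Gaussian N(mean, var) evaluated at x."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_joint(x, params):
    """Return {class: log p(x, class)}.

    params maps class -> (prior, [(weight, mean, var), ...]); each class
    has its own Gaussian mixture (an illustrative layout, not the paper's).
    """
    out = {}
    for y, (prior, comps) in params.items():
        log_lik = log_sum_exp(
            [math.log(w) + log_gauss(x, m, v) for w, m, v in comps]
        )
        out[y] = math.log(prior) + log_lik
    return out


def objective(labeled, unlabeled, params, alpha, hybrid):
    """Unified objective: labeled term + alpha * unlabeled term.

    hybrid=True  -> labeled term is sum of log p(y | x) (assumed form)
    hybrid=False -> labeled term is sum of log p(x, y)  (assumed form)
    The unlabeled term is sum of log p(x) in both cases.
    """
    total = 0.0
    for x, y in labeled:
        joint = log_joint(x, params)
        if hybrid:
            total += joint[y] - log_sum_exp(list(joint.values()))
        else:
            total += joint[y]
    for x in unlabeled:
        total += alpha * log_sum_exp(list(log_joint(x, params).values()))
    return total


# Toy usage: two classes, one Gaussian component each.
params = {
    "a": (0.5, [(1.0, 0.0, 1.0)]),
    "b": (0.5, [(1.0, 3.0, 1.0)]),
}
labeled = [(0.1, "a"), (2.9, "b")]
unlabeled = [0.5, 2.0]
generative_value = objective(labeled, unlabeled, params, 0.5, hybrid=False)
hybrid_value = objective(labeled, unlabeled, params, 0.5, hybrid=True)
```

Under these assumptions the two objectives differ only in the labeled term, and `alpha` plays the role of the relative weight on the unlabeled part. Setting `alpha` to 0 recovers a purely supervised criterion in either case.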

Original language: English (US)
Title of host publication: NAACL HLT 2009 - Semi-Supervised Learning for Natural Language Processing, Proceedings of the Workshop
Editors: Qin Iris Wang, Kevin Duh, Dekang Lin
Publisher: Association for Computational Linguistics (ACL)
Pages: 75-83
Number of pages: 9
ISBN (Electronic): 9781932432381
State: Published - 2009
Event: 2009 Semi-Supervised Learning for Natural Language Processing, SSL-NLP2009 - Boulder, United States
Duration: Jun 4 2009 → …

Publication series

Name: NAACL HLT 2009 - Semi-Supervised Learning for Natural Language Processing, Proceedings of the Workshop

Conference

Conference: 2009 Semi-Supervised Learning for Natural Language Processing, SSL-NLP2009
Country/Territory: United States
City: Boulder
Period: 6/4/09 → …

ASJC Scopus subject areas

  • Computer Science Applications
  • Software
  • Language and Linguistics
  • Linguistics and Language
