Abstract
This paper proposes a new training framework for mixed labeled and unlabeled data and evaluates it on the task of binary phonetic classification. Our training objective combines Maximum Mutual Information (MMI) for labeled data with Maximum Likelihood (ML) for unlabeled data. Through this modified objective, MMI estimates are smoothed with ML estimates obtained from the unlabeled data. Moreover, the training criterion can help an existing model adapt to new speech characteristics present in the unlabeled speech. In our phonetic classification experiments, the error rate decreases consistently from MLE to MMIE with I-smoothing, and further to MMIE with unlabeled-smoothing; transductive-MMIE reduces the error rate still further. We also experimented with the gender-mismatched case, where in the best configuration MMIE with unlabeled data achieves an error rate 9.3% absolute lower than MLE and 2.35% absolute lower than MMIE with I-smoothing.
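To make the combined criterion concrete, the sketch below illustrates one plausible form of it: an MMI term (log posterior of the correct class) over labeled frames plus a weighted ML term (log marginal likelihood) over unlabeled frames. The 1-D Gaussian class models, the function names, and the smoothing weight `tau` are illustrative assumptions for this sketch, not the paper's exact formulation.

```python
import numpy as np

def log_gauss(x, mean, var):
    """Log-density of a 1-D Gaussian (illustrative class model)."""
    return -0.5 * (np.log(2 * np.pi * var) + (x - mean) ** 2 / var)

def combined_objective(x_lab, y_lab, x_unlab, params, tau=1.0):
    """MMI on labeled frames + tau * ML on unlabeled frames (sketch)."""
    means, variances, log_priors = params
    # Class-conditional log-likelihoods plus log priors, labeled frames.
    ll_lab = np.stack([log_gauss(x_lab, m, v) + lp
                       for m, v, lp in zip(means, variances, log_priors)],
                      axis=1)
    # MMI term: log posterior of the reference class
    # (numerator minus the log of the sum over all classes).
    log_denom = np.logaddexp.reduce(ll_lab, axis=1)
    mmi = np.sum(ll_lab[np.arange(len(y_lab)), y_lab] - log_denom)
    # ML term: log marginal likelihood of each unlabeled frame
    # under the class mixture.
    ll_unlab = np.stack([log_gauss(x_unlab, m, v) + lp
                         for m, v, lp in zip(means, variances, log_priors)],
                        axis=1)
    ml = np.sum(np.logaddexp.reduce(ll_unlab, axis=1))
    return mmi + tau * ml
```

With `tau = 0` this reduces to pure MMI training on the labeled set; increasing `tau` smooths the discriminative estimate toward the ML statistics of the unlabeled speech, analogous in spirit to I-smoothing toward a prior.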
| Original language | English (US) |
| --- | --- |
| Pages (from-to) | 952-955 |
| Number of pages | 4 |
| Journal | Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH |
| State | Published - 2008 |
| Event | INTERSPEECH 2008 - 9th Annual Conference of the International Speech Communication Association, Brisbane, QLD, Australia. Duration: Sep 22 2008 → Sep 26 2008 |
Keywords
- Gaussian mixture models
- Maximum mutual information
- Unlabeled speech
ASJC Scopus subject areas
- Human-Computer Interaction
- Signal Processing
- Software
- Sensory Systems