TY - GEN
T1 - Generating small, accurate acoustic models with a modified Bayesian information criterion
AU - Yu, Kai
AU - Rutenbar, Robin A
PY - 2007
Y1 - 2007
N2 - Although Gaussian mixture models are commonly used in acoustic models for speech recognition, there is no standard method for determining the number of mixture components. In most models, the number of mixture components is assigned with little justification. While model selection techniques with a mathematical derivation, such as the Bayesian information criterion (BIC), have been applied, these criteria focus on properly modeling the true distribution of individual tied states (senones) without considering the entire acoustic model; this leads to suboptimal speech recognition performance. In this paper, we present a method to generate statistically justified acoustic models that consider inter-senone effects by modifying the BIC. Experimental results in the CMU Communicator domain show that, in contrast to previous strategies, the new method generates acoustic models that are not only smaller but also achieve a lower word error rate.
AB - Although Gaussian mixture models are commonly used in acoustic models for speech recognition, there is no standard method for determining the number of mixture components. In most models, the number of mixture components is assigned with little justification. While model selection techniques with a mathematical derivation, such as the Bayesian information criterion (BIC), have been applied, these criteria focus on properly modeling the true distribution of individual tied states (senones) without considering the entire acoustic model; this leads to suboptimal speech recognition performance. In this paper, we present a method to generate statistically justified acoustic models that consider inter-senone effects by modifying the BIC. Experimental results in the CMU Communicator domain show that, in contrast to previous strategies, the new method generates acoustic models that are not only smaller but also achieve a lower word error rate.
KW - Acoustic model training
KW - BIC
KW - Gaussian mixture models
KW - Model selection
UR - http://www.scopus.com/inward/record.url?scp=56149090486&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=56149090486&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:56149090486
SN - 9781605603162
T3 - Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
SP - 1165
EP - 1168
BT - International Speech Communication Association - 8th Annual Conference of the International Speech Communication Association, Interspeech 2007
T2 - 8th Annual Conference of the International Speech Communication Association, Interspeech 2007
Y2 - 27 August 2007 through 31 August 2007
ER -