TY - GEN
T1 - Bayesian learning for models of human speech perception
AU - Hasegawa-Johnson, M.
N1 - Publisher Copyright: © 2003 IEEE.
PY - 2003
Y1 - 2003
AB - Human speech recognition error rates are 30 times lower than machine error rates. Psychophysical experiments have pinpointed a number of specific human behaviors that may contribute to accurate speech recognition, but previous attempts to incorporate such behaviors into automatic speech recognition have often failed because the resulting models could not be easily trained from data. This paper describes Bayesian learning methods for computational models for human speech perception. Specifically, the linked computational models proposed in this paper seek to imitate the following human behaviors: independence of distinctive feature errors, perceptual magnet effect, the vowel sequence illusion, sensitivity to energy onsets and offsets, and redundant use of asynchronous acoustic correlates. The proposed models differ from many previous computational psychological models in that the desired behavior is learned from data, using a constrained optimization algorithm (the EM algorithm), rather than being coded into the model as a series of fixed rules.
KW - Automatic speech recognition
KW - Bayesian methods
KW - Computational modeling
KW - Error analysis
KW - Humans
KW - Mathematical model
KW - Psychology
KW - Signal processing algorithms
KW - Speech processing
KW - Speech recognition
UR - http://www.scopus.com/inward/record.url?scp=84948692850&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84948692850&partnerID=8YFLogxK
U2 - 10.1109/SSP.2003.1289432
DO - 10.1109/SSP.2003.1289432
M3 - Conference contribution
AN - SCOPUS:84948692850
T3 - IEEE Workshop on Statistical Signal Processing Proceedings
SP - 408
EP - 411
BT - Proceedings of the 2003 IEEE Workshop on Statistical Signal Processing, SSP 2003
PB - IEEE Computer Society
T2 - IEEE Workshop on Statistical Signal Processing, SSP 2003
Y2 - 28 September 2003 through 1 October 2003
ER -