Abstract
A theoretical basis for optimal multichannel speech enhancement is presented, sufficiently flexible to be used with any assumed statistical model and optimality criterion. Any Bayesian optimal one-channel estimator for speech enhancement can be generalized to the multichannel case as a sequentially constructed minimum variance distortionless response (MVDR) beamformer followed by an optimal one-channel postfilter. We present experimental results using the minimum mean-square error log-spectral amplitude (MMSE-logSA) optimality criterion, applied to a statistical model with a simplified channel but realistic inter-microphone noise coherence. Word error rate on the Audio-Visual Speech in a Car (AVICAR) corpus (moving car, windows open) is reduced from 18% to 9%.
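To make the MVDR stage named in the abstract concrete, the following is a minimal sketch for a single frequency bin, not the authors' implementation. It uses the textbook MVDR weights w = R⁻¹d / (dᴴR⁻¹d), where R is the noise covariance across microphones and d is the steering vector. The function names (`solve`, `mvdr_weights`, `beamform`), the example covariance, and the all-ones steering vector are assumptions made for illustration. The one-channel postfilter (MMSE-logSA in the paper) is not shown.

```python
def solve(A, b):
    """Solve A x = b for a small complex system (Gauss-Jordan, partial pivoting)."""
    n = len(b)
    # Build the augmented matrix [A | b] with complex entries.
    M = [[complex(v) for v in row] + [complex(b[i])] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("noise covariance is singular")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def mvdr_weights(R, d):
    """Return (w, residual_noise_variance) for noise covariance R and steering vector d.

    R is assumed Hermitian positive definite; d is a hypothetical steering vector.
    """
    d = [complex(v) for v in d]
    Rinv_d = solve(R, d)
    # d^H R^{-1} d is real and positive when R is Hermitian positive definite.
    denom = sum(di.conjugate() * xi for di, xi in zip(d, Rinv_d))
    w = [x / denom for x in Rinv_d]
    # Noise power left at the beamformer output: w^H R w = 1 / (d^H R^{-1} d).
    return w, (1.0 / denom).real


def beamform(w, x):
    """Apply the weights to one multichannel observation: y = w^H x."""
    return sum(wi.conjugate() * complex(xi) for wi, xi in zip(w, x))


if __name__ == "__main__":
    # Two microphones, coherent noise, second channel noisier (illustrative values).
    R = [[1.0, 0.5], [0.5, 2.0]]
    d = [1.0, 1.0]  # assumed: identical, time-aligned channels
    w, noise_var = mvdr_weights(R, d)
    print(w, noise_var)
    print(beamform(w, d))  # the target component passes with unit gain
```

With these illustrative values the weights come out at approximately 0.75 and 0.25, so the quieter microphone is trusted more, and the residual noise variance is approximately 0.875. That scalar is the kind of quantity a one-channel postfilter would then take as its noise estimate.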
Original language | English (US) |
---|---|
Title of host publication | 2006 IEEE International Conference on Acoustics, Speech, and Signal Processing - Proceedings |
Volume | 3 |
State | Published - 2006 |
Event | 2006 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2006 - Toulouse, France; Duration: May 14 2006 → May 19 2006 |
Other
Other | 2006 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2006 |
---|---|
Country/Territory | France |
City | Toulouse |
Period | 5/14/06 → 5/19/06 |
ASJC Scopus subject areas
- Software
- Signal Processing
- Electrical and Electronic Engineering