Robust speech recognition using beamforming with adaptive microphone gains and multichannel noise reduction

Shengkui Zhao, Xiong Xiao, Zhaofeng Zhang, Thi Ngoc Tho Nguyen, Xionghu Zhong, Bo Ren, Longbiao Wang, Douglas L Jones, Eng Siong Chng, Haizhou Li

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

This paper presents a robust speech recognition system using a microphone array for the 3rd CHiME Challenge. A minimum variance distortionless response (MVDR) beamformer with adaptive microphone gains is proposed for robust beamforming. Two microphone gain estimation methods are studied using the speech-dominant time-frequency bins. A multichannel noise reduction (MCNR) postprocessing step is also proposed to further reduce the interference in the MVDR-processed signal. Experimental results for the CHiME-3 challenge show that both the proposed MVDR beamformer with microphone gains and the MCNR postprocessing improve the speech recognition performance significantly. With the state-of-the-art deep neural network (DNN) based acoustic model, our system achieves a word error rate (WER) of 11.67% on the real test data of the evaluation set.
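As a rough illustration of the beamforming idea named in the abstract, the sketch below computes textbook MVDR weights, w = R^{-1} d / (d^H R^{-1} d), for a single frequency bin. It is not the authors' implementation: the function name `mvdr_weights`, the random noise covariance, the six-channel array size, and the fixed per-microphone gain values are all assumptions made for the example, and the abstract does not say how the gains are estimated beyond using speech-dominant time-frequency bins.

```python
import numpy as np


def mvdr_weights(noise_cov, steering):
    """Textbook MVDR weights for one frequency bin.

    noise_cov: (M, M) Hermitian positive-definite noise covariance.
    steering:  (M,) complex steering vector toward the target.
    """
    # Solve R x = d rather than forming the explicit inverse of R.
    r_inv_d = np.linalg.solve(noise_cov, steering)
    # Normalise so that the distortionless constraint w^H d = 1 holds.
    return r_inv_d / (steering.conj() @ r_inv_d)


rng = np.random.default_rng(0)
M = 6  # number of microphones (illustrative choice)

# Hypothetical noise covariance: random Hermitian positive-definite matrix.
A = rng.standard_normal((M, M)) + 1j * rng.standard_normal((M, M))
R = A @ A.conj().T + 1e-3 * np.eye(M)

# Hypothetical phase-only steering vector and per-microphone gains.
phases = rng.uniform(0.0, 2.0 * np.pi, size=M)
gains = np.array([1.0, 0.8, 1.2, 0.9, 1.1, 0.7])  # assumed values
d = gains * np.exp(-1j * phases)  # gain-weighted steering vector

w = mvdr_weights(R, d)

# Response toward the target direction and residual noise power.
target_response = np.vdot(w, d)          # w^H d
noise_power = np.vdot(w, R @ w).real     # w^H R w
```

Folding the gains into the steering vector is only one plausible way to model unequal microphone sensitivities; the point of the sketch is that the target response stays at unity while the output noise power is minimised.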

Original language: English (US)
Title of host publication: 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2015 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 460-467
Number of pages: 8
ISBN (Electronic): 9781479972913
DOIs: https://doi.org/10.1109/ASRU.2015.7404831
State: Published - Feb 10 2016
Event: IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2015 - Scottsdale, United States
Duration: Dec 13 2015 to Dec 17 2015

Publication series

Name: 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2015 - Proceedings

Other

Other: IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2015
Country: United States
City: Scottsdale
Period: 12/13/15 to 12/17/15

Fingerprint

Microphones
Beamforming
Noise abatement
Speech recognition
Bins
Acoustics

Keywords

  • CHiME 3
  • MVDR beamforming
  • microphone gain
  • multichannel noise reduction
  • robust speech recognition

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Networks and Communications
  • Computer Vision and Pattern Recognition

Cite this

Zhao, S., Xiao, X., Zhang, Z., Nguyen, T. N. T., Zhong, X., Ren, B., ... Li, H. (2016). Robust speech recognition using beamforming with adaptive microphone gains and multichannel noise reduction. In 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2015 - Proceedings (pp. 460-467). [7404831] (2015 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2015 - Proceedings). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ASRU.2015.7404831

@inproceedings{ccb4db392576427ab7185fba35682237,
title = "Robust speech recognition using beamforming with adaptive microphone gains and multichannel noise reduction",
abstract = "This paper presents a robust speech recognition system using a microphone array for the 3rd CHiME Challenge. A minimum variance distortionless response (MVDR) beamformer with adaptive microphone gains is proposed for robust beamforming. Two microphone gain estimation methods are studied using the speech-dominant time-frequency bins. A multichannel noise reduction (MCNR) postprocessing step is also proposed to further reduce the interference in the MVDR-processed signal. Experimental results for the CHiME-3 challenge show that both the proposed MVDR beamformer with microphone gains and the MCNR postprocessing improve the speech recognition performance significantly. With the state-of-the-art deep neural network (DNN) based acoustic model, our system achieves a word error rate (WER) of 11.67{\%} on the real test data of the evaluation set.",
keywords = "CHiME 3, MVDR beamforming, microphone gain, multichannel noise reduction, robust speech recognition",
author = "Shengkui Zhao and Xiong Xiao and Zhaofeng Zhang and Nguyen, {Thi Ngoc Tho} and Xionghu Zhong and Bo Ren and Longbiao Wang and Jones, {Douglas L} and Chng, {Eng Siong} and Haizhou Li",
year = "2016",
month = "2",
day = "10",
doi = "10.1109/ASRU.2015.7404831",
language = "English (US)",
series = "2015 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2015 - Proceedings",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
pages = "460--467",
booktitle = "2015 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2015 - Proceedings",
address = "United States",

}

TY - GEN

T1 - Robust speech recognition using beamforming with adaptive microphone gains and multichannel noise reduction

AU - Zhao, Shengkui

AU - Xiao, Xiong

AU - Zhang, Zhaofeng

AU - Nguyen, Thi Ngoc Tho

AU - Zhong, Xionghu

AU - Ren, Bo

AU - Wang, Longbiao

AU - Jones, Douglas L

AU - Chng, Eng Siong

AU - Li, Haizhou

PY - 2016/2/10

Y1 - 2016/2/10

AB - This paper presents a robust speech recognition system using a microphone array for the 3rd CHiME Challenge. A minimum variance distortionless response (MVDR) beamformer with adaptive microphone gains is proposed for robust beamforming. Two microphone gain estimation methods are studied using the speech-dominant time-frequency bins. A multichannel noise reduction (MCNR) postprocessing step is also proposed to further reduce the interference in the MVDR-processed signal. Experimental results for the CHiME-3 challenge show that both the proposed MVDR beamformer with microphone gains and the MCNR postprocessing improve the speech recognition performance significantly. With the state-of-the-art deep neural network (DNN) based acoustic model, our system achieves a word error rate (WER) of 11.67% on the real test data of the evaluation set.

KW - CHiME 3

KW - MVDR beamforming

KW - microphone gain

KW - multichannel noise reduction

KW - robust speech recognition

UR - http://www.scopus.com/inward/record.url?scp=84964422570&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84964422570&partnerID=8YFLogxK

U2 - 10.1109/ASRU.2015.7404831

DO - 10.1109/ASRU.2015.7404831

M3 - Conference contribution

AN - SCOPUS:84964422570

T3 - 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2015 - Proceedings

SP - 460

EP - 467

BT - 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, ASRU 2015 - Proceedings

PB - Institute of Electrical and Electronics Engineers Inc.

ER -