A learning-based approach to direction of arrival estimation in noisy and reverberant environments

Xiong Xiao, Shengkui Zhao, Xionghu Zhong, Douglas L Jones, Eng Siong Chng, Haizhou Li

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

This paper presents a learning-based approach to direction of arrival (DOA) estimation from microphone array input. Traditional signal processing methods, such as the classic least squares (LS) method, rely on strong assumptions about the signal model and on accurate estimates of the time delay of arrival (TDOA). They work well only in relatively clean conditions and degrade under noise and reverberation. We propose a learning-based approach that learns robust DOA estimation from a large amount of simulated noisy and reverberant microphone array input. Specifically, we extract features from generalised cross-correlation (GCC) vectors and use a multilayer perceptron neural network to learn the nonlinear mapping from these features to the DOA. One advantage of the learning-based method is that DOA estimation becomes more accurate as more training data becomes available. Experimental results on simulated data show that the proposed learning-based method produces much better results than the state-of-the-art LS method. Tests on real data recorded in meeting rooms show improved root-mean-square error (RMSE) compared with the LS method.
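The feature-extraction step mentioned in the abstract can be illustrated with a minimal sketch. It assumes the common PHAT weighting for the GCC and simply concatenates the GCC vectors of all microphone pairs; the abstract does not state the weighting, the lag range, or the feature layout the authors used, so the function names `gcc_phat` and `gcc_features` and the parameter `max_lag` are hypothetical choices made for illustration only.

```python
import itertools

import numpy as np


def gcc_phat(x1, x2, max_lag):
    """GCC-PHAT values for lags -max_lag..+max_lag (in samples).

    A positive peak lag means x2 is a delayed copy of x1.
    PHAT weighting is an assumption, not taken from the paper.
    """
    n = len(x1) + len(x2)  # zero-pad so the circular correlation does not wrap
    X1 = np.fft.rfft(x1, n=n)
    X2 = np.fft.rfft(x2, n=n)
    cross = X2 * np.conj(X1)
    cross /= np.abs(cross) + 1e-12  # keep only the phase of the cross-spectrum
    cc = np.fft.irfft(cross, n=n)
    # Reorder so index 0 is lag -max_lag and the last index is lag +max_lag.
    return np.concatenate((cc[-max_lag:], cc[:max_lag + 1]))


def gcc_features(frames, max_lag):
    """Concatenate GCC-PHAT vectors over all microphone pairs.

    frames has shape (num_mics, num_samples); the layout is hypothetical.
    """
    pairs = itertools.combinations(range(frames.shape[0]), 2)
    return np.concatenate(
        [gcc_phat(frames[i], frames[j], max_lag) for i, j in pairs]
    )


# Toy check with two microphones: the second hears the source 3 samples later.
rng = np.random.default_rng(0)
source = rng.standard_normal(1024)
delay = 3
mic1 = source
mic2 = np.concatenate((np.zeros(delay), source[:-delay]))

features = gcc_features(np.stack([mic1, mic2]), max_lag=8)
estimated_delay = int(np.argmax(features)) - 8  # index -> lag in samples
```

A TDOA-based method such as LS would use only the peak location (`estimated_delay` here), whereas a learning-based method in the spirit of the abstract would pass the whole `features` vector to a neural network, so that information in the shape of the correlation is not discarded when noise or reverberation corrupts the peak.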

Original language: English (US)
Title of host publication: 2015 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2015 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 2814-2818
Number of pages: 5
ISBN (Electronic): 9781467369978
DOI: https://doi.org/10.1109/ICASSP.2015.7178484
State: Published - Aug 4 2015
Event: 40th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2015 - Brisbane, Australia
Duration: Apr 19 2015 to Apr 24 2015

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume: 2015-August
ISSN (Print): 1520-6149

Other

Other: 40th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2015
Country: Australia
City: Brisbane
Period: 4/19/15 to 4/24/15

Fingerprint

Direction of arrival
Microphones
Reverberation
Multilayer neural networks
Acoustic noise
Mean square error
Time delay
Signal processing
Neural networks
Testing

Keywords

  • direction of arrival
  • least squares
  • machine learning
  • microphone arrays
  • neural networks

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering

Cite this

Xiao, X., Zhao, S., Zhong, X., Jones, D. L., Chng, E. S., & Li, H. (2015). A learning-based approach to direction of arrival estimation in noisy and reverberant environments. In 2015 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2015 - Proceedings (pp. 2814-2818). [7178484] (ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings; Vol. 2015-August). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/ICASSP.2015.7178484

@inproceedings{3a2433d675574d3fa9c8795535dbda96,
title = "A learning-based approach to direction of arrival estimation in noisy and reverberant environments",
keywords = "direction of arrival, least squares, machine learning, microphone arrays, neural networks",
author = "Xiong Xiao and Shengkui Zhao and Xionghu Zhong and Jones, {Douglas L} and Chng, {Eng Siong} and Haizhou Li",
year = "2015",
month = "8",
day = "4",
doi = "10.1109/ICASSP.2015.7178484",
language = "English (US)",
series = "ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
pages = "2814--2818",
booktitle = "2015 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2015 - Proceedings",
address = "United States",

}

TY - GEN

T1 - A learning-based approach to direction of arrival estimation in noisy and reverberant environments

AU - Xiao, Xiong

AU - Zhao, Shengkui

AU - Zhong, Xionghu

AU - Jones, Douglas L

AU - Chng, Eng Siong

AU - Li, Haizhou

PY - 2015/8/4

Y1 - 2015/8/4

KW - direction of arrival

KW - least squares

KW - machine learning

KW - microphone arrays

KW - neural networks

UR - http://www.scopus.com/inward/record.url?scp=84946079951&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84946079951&partnerID=8YFLogxK

U2 - 10.1109/ICASSP.2015.7178484

DO - 10.1109/ICASSP.2015.7178484

M3 - Conference contribution

AN - SCOPUS:84946079951

T3 - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings

SP - 2814

EP - 2818

BT - 2015 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2015 - Proceedings

PB - Institute of Electrical and Electronics Engineers Inc.

ER -