Joint acoustic and spectral modeling for speech dereverberation using non-negative representations

Nasser Mohammadiha, Paris Smaragdis, Simon Doclo

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

This paper proposes a single-channel speech dereverberation method enhancing the spectrum of the reverberant speech signal. The proposed method uses a non-negative approximation of the convolutive transfer function (N-CTF) to simultaneously estimate the magnitude spectrograms of the speech signal and the room impulse response (RIR). To utilize the speech spectral structure, we propose to model the speech spectrum using non-negative matrix factorization, which is directly used in the N-CTF model resulting in a new cost function. We derive new estimators for the parameters by minimizing the obtained cost function. Additionally, to investigate the effect of the speech temporal dynamics for dereverberation, we use a frame stacking method and derive optimal estimators. Experiments are performed for two measured RIRs and the performance of the proposed method is compared to the performance of a state-of-the-art dereverberation method enhancing the speech spectrum. Experimental results show that the proposed method improved instrumental speech quality measures, where using speech temporal dynamics was found to be beneficial in severe reverberation conditions.

Original languageEnglish (US)
Title of host publication2015 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2015 - Proceedings
PublisherInstitute of Electrical and Electronics Engineers Inc.
Pages4410-4414
Number of pages5
ISBN (Electronic)9781467369978
DOIs
StatePublished - Aug 4 2015
Event40th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2015 - Brisbane, Australia
Duration: Apr 19 2014Apr 24 2014

Publication series

NameICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume2015-August
ISSN (Print)1520-6149

Other

Other40th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2015
Country/TerritoryAustralia
CityBrisbane
Period4/19/144/24/14

Keywords

  • Non-negative convolutive transfer function
  • dictionary-based processing
  • non-negative matrix factorization

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering

Fingerprint

Dive into the research topics of 'Joint acoustic and spectral modeling for speech dereverberation using non-negative representations'. Together they form a unique fingerprint.

Cite this