Collaborative speech dereverberation: Regularized tensor factorization for crowdsourced multi-channel recordings

Sanna Wager, Minje Kim

Research output: Chapter in Book/Report/Conference proceeding - Conference contribution

Abstract

We propose a regularized nonnegative tensor factorization (NTF) model for multi-channel speech dereverberation that incorporates prior knowledge about clean speech. The approach models the problem as recovering a signal convolved with different room impulse responses, allowing the dereverberation problem to benefit from microphone arrays. The factorization learns both individual reverberation filters and channel-specific delays, which makes it possible to employ an ad-hoc microphone array with heterogeneous sensors (such as multi-channel recordings by a crowd) even if they are not synchronized. We integrate two prior-knowledge regularization schemes to increase the stability of dereverberation performance. First, a Nonnegative Matrix Factorization (NMF) inner routine is introduced to inform the original NTF problem of the pre-trained clean speech basis vectors, so that the optimization process can focus on estimating their activations rather than the whole clean speech spectra. Second, the NMF activation matrix is further regularized to take on characteristics of dry signals using sparsity and smoothness constraints. Empirical dereverberation results on different simulated reverberation setups show that the prior-knowledge regularization schemes improve both recovered sound quality and speech intelligibility compared to a baseline NTF approach.
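The model described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes the clean magnitude spectrogram is S = WH with a pre-trained basis W, that each channel observes S convolved along time with a nonnegative per-frequency filter and shifted by an integer frame delay, and that the fit is measured by squared error with an L1 sparsity penalty and a squared temporal-difference smoothness penalty on H. The function names, array shapes, cost function, and penalty forms are all assumptions chosen for illustration; the abstract does not specify them.

```python
import numpy as np


def reverberant_model(W, H, filters, delays):
    """Predict per-channel magnitude spectrograms from clean speech S = W @ H.

    W: (F, K) clean-speech basis (assumed pre-trained), H: (K, T) activations,
    filters: (C, F, L) nonnegative per-channel, per-frequency filters,
    delays: length-C nonnegative integer frame delays (hypothetical encoding).
    """
    S = W @ H
    F, T = S.shape
    C, _, L = filters.shape
    X_hat = np.zeros((C, F, T))
    for c in range(C):
        for l in range(L):
            shift = l + int(delays[c])
            if shift >= T:
                break
            # Filter tap l scales a copy of S delayed by `shift` frames.
            X_hat[c, :, shift:] += filters[c, :, l:l + 1] * S[:, :T - shift]
    return X_hat


def regularized_objective(X, W, H, filters, delays, sparsity=0.1, smoothness=0.1):
    """Squared-error fit plus assumed sparsity and smoothness penalties on H."""
    X_hat = reverberant_model(W, H, filters, delays)
    fit = np.sum((X - X_hat) ** 2)
    l1 = np.sum(np.abs(H))                 # sparsity of activations
    tv = np.sum(np.diff(H, axis=1) ** 2)   # smoothness across frames
    return fit + sparsity * l1 + smoothness * tv
```

Keeping W fixed means an optimizer only has to update H, the filters, and the delays, which mirrors the abstract's point that the optimization can focus on the activations rather than on the whole clean speech spectra.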

Original language: English (US)
Title of host publication: 2018 26th European Signal Processing Conference, EUSIPCO 2018
Publisher: European Signal Processing Conference, EUSIPCO
Pages: 1532-1536
Number of pages: 5
ISBN (Electronic): 9789082797015
State: Published - Nov 29 2018
Externally published: Yes
Event: 26th European Signal Processing Conference, EUSIPCO 2018 - Rome, Italy
Duration: Sep 3 2018 to Sep 7 2018

Publication series

Name: European Signal Processing Conference
Volume: 2018-September
ISSN (Print): 2219-5491

Conference

Conference: 26th European Signal Processing Conference, EUSIPCO 2018
Country/Territory: Italy
City: Rome
Period: 9/3/18 to 9/7/18

Keywords

  • Collaborative audio enhancement
  • Multi-channel dereverberation
  • Nonnegative matrix factorization
  • Nonnegative tensor factorization
  • Speech enhancement

ASJC Scopus subject areas

  • Signal Processing
  • Electrical and Electronic Engineering
