Mixtures of local dictionaries for unsupervised speech enhancement

Research output: Contribution to journal › Article › peer-review

Abstract

We propose a novel extension of Nonnegative Matrix Factorization (NMF) that models a signal with multiple local dictionaries that are activated sparsely. The set of local dictionaries for a source, e.g., speech, disjointly constitutes a superset that is more discriminative than an ordinary NMF dictionary, because its local structures represent the source's manifold better. A block sparsity constraint regularizes the NMF solutions so that only one or a small number of blocks are active at a given time. Moreover, a concentration prior further regularizes the bases within each block to be close to one another for better locality preservation. We test the proposed Mixture of Local Dictionaries (MLD) on single-channel speech enhancement tasks and show that it outperforms the state of the art by up to 2 dB in signal-to-distortion ratio, especially in the unsupervised setting where neither the speaker identity nor the type of noise is known in advance.
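
The block sparsity idea described in the abstract can be illustrated with a minimal sketch. This is not the paper's MLD algorithm: it assumes a fixed dictionary W whose columns are grouped into equally sized blocks, uses the standard KL-divergence multiplicative update for the activations, and adds a generic log-of-block-L1-norm penalty as a stand-in for the block sparsity constraint. The concentration prior is omitted. The function name and the parameters lam and delta are hypothetical, chosen only for this illustration.

```python
import numpy as np


def block_sparse_activations(V, W, n_blocks, lam=0.1, delta=1e-3,
                             n_iter=100, seed=0):
    """Estimate nonnegative H with V ~= W @ H under a block-sparsity penalty.

    Illustrative only: a generic group-sparse NMF update, not the
    update rules from the paper.
    V: (n_freq, n_frames) nonnegative spectrogram.
    W: (n_freq, n_bases) nonnegative dictionary; columns are ordered
       block by block, with n_bases divisible by n_blocks.
    """
    eps = 1e-12  # guards against division by zero
    n_bases = W.shape[1]
    if n_bases % n_blocks != 0:
        raise ValueError("n_bases must be divisible by n_blocks")
    block_size = n_bases // n_blocks
    n_frames = V.shape[1]

    rng = np.random.default_rng(seed)
    H = rng.random((n_bases, n_frames)) + eps
    ones = np.ones_like(V)

    for _ in range(n_iter):
        WH = W @ H + eps
        # Numerator of the standard KL-NMF multiplicative update.
        numer = W.T @ (V / WH)
        # L1 norm of each block's activations per frame: (n_blocks, n_frames).
        block_l1 = H.reshape(n_blocks, block_size, n_frames).sum(axis=1)
        # Gradient of lam * log(delta + block_l1), shared by every basis
        # in the block; small blocks receive a larger shrinking force.
        penalty = np.repeat(lam / (delta + block_l1), block_size, axis=0)
        denom = W.T @ ones + penalty + eps
        H *= numer / denom
    return H
```

With a dictionary of, say, 3 blocks of 2 bases each, calling block_sparse_activations(V, W, n_blocks=3) returns a (6, n_frames) activation matrix. Larger values of lam push more blocks toward zero in each frame, which is the qualitative behavior the abstract attributes to the block sparsity constraint.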

Original language: English (US)
Article number: 6874558
Pages (from-to): 288-292
Number of pages: 5
Journal: IEEE Signal Processing Letters
Volume: 22
Issue number: 3
State: Published - Mar 1 2015

Keywords

  • Manifold learning
  • nonnegative matrix factorization
  • speech enhancement

ASJC Scopus subject areas

  • Signal Processing
  • Electrical and Electronic Engineering
  • Applied Mathematics
