Latent variable decomposition of spectrograms for single channel speaker separation

Bhiksha Raj, Paris Smaragdis

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

In this paper we present an algorithm for the separation of multiple speakers from mixed single-channel recordings by latent variable decomposition of the speech spectrogram. We model each magnitude spectral vector in the short-time Fourier transform of a speech signal as the outcome of a discrete random process that generates frequency bin indices. The distribution of the process is modelled as a mixture of multinomial distributions, such that the mixture weights of the component multinomials vary from analysis window to analysis window. The component multinomials are assumed to be speaker specific and are learnt from training signals for each speaker. The distributions representing magnitude spectral vectors for the mixed signal are decomposed into mixtures of the multinomials for all component speakers. The frequency distribution, i.e. the spectrum, for each speaker is reconstructed from this decomposition. Experimental results show that the proposed method is very effective at separating mixed signals.
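The decomposition described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`decompose`, `separate`), the iteration counts, and the final reconstruction rule (splitting each mixture bin in proportion to each speaker's share of the fitted model) are assumptions made for illustration. The sketch treats each spectrogram column as a histogram over frequency bins and fits the mixture of multinomials by expectation-maximization, first learning per-speaker bases from training spectrograms, then holding those bases fixed and estimating only the per-frame mixture weights for the mixed signal.

```python
import numpy as np

EPS = 1e-12  # guards against division by zero


def normalize_columns(a):
    """Scale each column so that it sums to one (a discrete distribution)."""
    return a / (a.sum(axis=0, keepdims=True) + EPS)


def decompose(spec, n_components=None, bases=None, n_iter=100, seed=0):
    """EM fit of P_t(f) = sum_z P_t(z) P(f|z) to a magnitude spectrogram.

    spec  : (n_freq, n_frames) non-negative array.
    bases : optional (n_freq, n_components) array of fixed multinomials P(f|z).
            If None, the bases are learnt together with the weights.
    Returns (bases, weights), where weights[:, t] holds P_t(z) for frame t.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = spec.shape
    learn_bases = bases is None
    if learn_bases:
        bases = normalize_columns(rng.random((n_freq, n_components)))
    weights = normalize_columns(rng.random((bases.shape[1], n_frames)))
    for _ in range(n_iter):
        # E-step and M-step folded together: the posterior P_t(z|f) only
        # appears through the ratio of the data to the current model.
        ratio = spec / (bases @ weights + EPS)
        new_weights = weights * (bases.T @ ratio)
        if learn_bases:
            bases = normalize_columns(bases * (ratio @ weights.T))
        weights = normalize_columns(new_weights)
    return bases, weights


def separate(mix_spec, speaker_bases, n_iter=100):
    """Split a mixture spectrogram using fixed, speaker-specific bases.

    Assumption: each speaker receives the fraction of every time-frequency
    bin that its own components contribute to the fitted mixture model.
    """
    all_bases = np.hstack(speaker_bases)
    _, weights = decompose(mix_spec, bases=all_bases, n_iter=n_iter)
    model = all_bases @ weights + EPS
    estimates, start = [], 0
    for b in speaker_bases:
        stop = start + b.shape[1]
        share = (b @ weights[start:stop]) / model
        estimates.append(mix_spec * share)
        start = stop
    return estimates
```

In use, `decompose` would be called once per speaker on training spectrograms to obtain that speaker's bases, and `separate` would then be called on the mixture's magnitude spectrogram with the list of learnt bases. Converting the estimated magnitudes back to waveforms (for example by reusing the mixture phase) is outside this sketch.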

Original language: English (US)
Title of host publication: 2005 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics
Pages: 17-20
Number of pages: 4
State: Published - 2005
Externally published: Yes
Event: 2005 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics - New Paltz, NY, United States
Duration: Oct 16 2005 to Oct 19 2005

Publication series

Name: IEEE Workshop on Applications of Signal Processing to Audio and Acoustics

Other

Other: 2005 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics
Country/Territory: United States
City: New Paltz, NY
Period: 10/16/05 to 10/19/05

ASJC Scopus subject areas

  • Signal Processing
