A non-negative approach to semi-supervised separation of speech from noise with the use of temporal dynamics

Gautham J. Mysore, Paris Smaragdis

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

We present a semi-supervised source separation methodology to denoise speech by modeling speech as one source and noise as the other source. We model speech using the recently proposed non-negative hidden Markov model, which uses multiple non-negative dictionaries and a Markov chain to jointly model spectral structure and temporal dynamics of speech. We perform separation of the speech and noise using the recently proposed non-negative factorial hidden Markov model. Although the speech model is learned from training data, the noise model is learned during the separation process and requires no training data. We show that the proposed method outperforms non-negative spectrogram factorization, which ignores the non-stationarity and temporal dynamics of speech.
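
The semi-supervised idea in the abstract (a speech model learned beforehand, a noise model learned during separation) can be illustrated with the simpler baseline it is compared against, non-negative spectrogram factorization. The following is only a minimal sketch under stated assumptions, not the non-negative hidden Markov model or the factorial model of the paper: it uses a single fixed speech dictionary, standard KL-divergence multiplicative updates, and a soft mask for reconstruction. The function name `semi_supervised_nmf` and all parameter names are hypothetical.

```python
import numpy as np

def semi_supervised_nmf(V, W_speech, n_noise=5, n_iter=200, eps=1e-12, seed=0):
    """Illustrative baseline only (hypothetical helper, not the paper's method).

    V        : (F, T) non-negative magnitude spectrogram of the noisy mixture.
    W_speech : (F, Ks) speech dictionary learned beforehand; kept fixed here.
    Returns (speech_estimate, noise_estimate, W_noise).
    """
    rng = np.random.default_rng(seed)
    F, T = V.shape
    Ks = W_speech.shape[1]

    # The noise dictionary and all activations start random and are
    # learned from the mixture itself, so no noise training data is used.
    W_noise = rng.random((F, n_noise)) + eps
    H = rng.random((Ks + n_noise, T)) + eps
    ones = np.ones_like(V)

    for _ in range(n_iter):
        W = np.hstack([W_speech, W_noise])

        # KL-divergence multiplicative update for all activations.
        R = V / (W @ H + eps)
        H *= (W.T @ R) / (W.T @ ones + eps)

        # Update only the noise columns; the speech columns stay fixed.
        R = V / (W @ H + eps)
        H_noise = H[Ks:]
        W_noise *= (R @ H_noise.T) / (ones @ H_noise.T + eps)

        # Normalise noise columns to unit sum and move the scale into H.
        scale = W_noise.sum(axis=0, keepdims=True) + eps
        W_noise /= scale
        H[Ks:] *= scale.T

    # Soft mask: share of each time-frequency bin explained by speech.
    speech_part = W_speech @ H[:Ks]
    noise_part = W_noise @ H[Ks:]
    mask = speech_part / (speech_part + noise_part + eps)
    return mask * V, (1.0 - mask) * V, W_noise
```

Because every frame is explained by the same pooled dictionary, a factorization like this treats frames independently, which is the limitation the abstract points to when it says the baseline ignores non-stationarity and temporal dynamics.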

Original language: English (US)
Title of host publication: 2011 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2011 - Proceedings
Pages: 17-20
Number of pages: 4
State: Published - 2011
Event: 36th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2011 - Prague, Czech Republic
Duration: May 22 2011 → May 27 2011

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
ISSN (Print): 1520-6149

Other

Other: 36th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2011
Country/Territory: Czech Republic
City: Prague
Period: 5/22/11 → 5/27/11

Keywords

  • Denoising
  • Semi-supervised source separation

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering
