TY - GEN
T1 - A non-negative approach to semi-supervised separation of speech from noise with the use of temporal dynamics
AU - Mysore, Gautham J.
AU - Smaragdis, Paris
PY - 2011
Y1 - 2011
N2 - We present a semi-supervised source separation methodology to denoise speech by modeling speech as one source and noise as the other source. We model speech using the recently proposed non-negative hidden Markov model, which uses multiple non-negative dictionaries and a Markov chain to jointly model spectral structure and temporal dynamics of speech. We perform separation of the speech and noise using the recently proposed non-negative factorial hidden Markov model. Although the speech model is learned from training data, the noise model is learned during the separation process and requires no training data. We show that the proposed method achieves superior results to using non-negative spectrogram factorization, which ignores the non-stationarity and temporal dynamics of speech.
KW - Denoising
KW - Semi-supervised source separation
UR - http://www.scopus.com/inward/record.url?scp=80051625972&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=80051625972&partnerID=8YFLogxK
U2 - 10.1109/ICASSP.2011.5946317
DO - 10.1109/ICASSP.2011.5946317
M3 - Conference contribution
AN - SCOPUS:80051625972
SN - 9781457705397
T3 - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
SP - 17
EP - 20
BT - 2011 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2011 - Proceedings
T2 - 36th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2011
Y2 - 22 May 2011 through 27 May 2011
ER -