TY - GEN
T1 - Prediction based filtering and smoothing to exploit temporal dependencies in NMF
AU - Mohammadiha, Nasser
AU - Smaragdis, Paris
AU - Leijon, Arne
PY - 2013/10/18
Y1 - 2013/10/18
N2 - Nonnegative matrix factorization is an appealing technique for many audio applications. However, in its basic form it does not use temporal structure, which is an important source of information in speech processing. In this paper, we propose NMF-based filtering and smoothing algorithms that are related to Kalman filtering and smoothing. While our prediction step is similar to that of Kalman filtering, we develop a multiplicative update step which is more convenient for nonnegative data analysis and in line with existing NMF literature. The proposed smoothing approach introduces an unavoidable processing delay, but the filtering algorithm does not and can be readily used for on-line applications. Our experiments using the proposed algorithms show a significant improvement over the baseline NMF approaches. In the case of speech denoising with factory noise at 0 dB input SNR, the smoothing algorithm outperforms NMF by 3.2 dB in SDR and around 0.5 MOS in PESQ. Likewise, source separation experiments result in improved performance due to taking advantage of the temporal regularities in speech.
AB - Nonnegative matrix factorization is an appealing technique for many audio applications. However, in its basic form it does not use temporal structure, which is an important source of information in speech processing. In this paper, we propose NMF-based filtering and smoothing algorithms that are related to Kalman filtering and smoothing. While our prediction step is similar to that of Kalman filtering, we develop a multiplicative update step which is more convenient for nonnegative data analysis and in line with existing NMF literature. The proposed smoothing approach introduces an unavoidable processing delay, but the filtering algorithm does not and can be readily used for on-line applications. Our experiments using the proposed algorithms show a significant improvement over the baseline NMF approaches. In the case of speech denoising with factory noise at 0 dB input SNR, the smoothing algorithm outperforms NMF by 3.2 dB in SDR and around 0.5 MOS in PESQ. Likewise, source separation experiments result in improved performance due to taking advantage of the temporal regularities in speech.
KW - Nonnegative matrix factorization (NMF)
KW - Prediction
KW - Probabilistic latent component analysis (PLCA)
KW - Temporal dependencies
UR - http://www.scopus.com/inward/record.url?scp=84890522039&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84890522039&partnerID=8YFLogxK
U2 - 10.1109/ICASSP.2013.6637773
DO - 10.1109/ICASSP.2013.6637773
M3 - Conference contribution
AN - SCOPUS:84890522039
SN - 9781479903566
T3 - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
SP - 873
EP - 877
BT - 2013 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2013 - Proceedings
T2 - 2013 38th IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2013
Y2 - 26 May 2013 through 31 May 2013
ER -