TY - GEN
T1 - Noise-robust dynamic time warping using PLCA features
AU - King, Brian
AU - Smaragdis, Paris
AU - Mysore, Gautham J.
PY - 2012
Y1 - 2012
N2 - Conventional speech features, such as mel-frequency cepstral coefficients, tend to perform well in template matching systems, such as dynamic time warping, in low-noise conditions. However, their performance tends to degrade in noisy environments. We propose a method of calculating features using the probabilistic latent component analysis (PLCA) framework. This framework models the speech and noise separately, leading to higher performance in noisy conditions than conventional methods. In this work, we compare our PLCA-based features with conventional features on the task of aligning a high-fidelity speech recording to a noisy speech recording, a scenario common in automatic dialogue replacement.
AB - Conventional speech features, such as mel-frequency cepstral coefficients, tend to perform well in template matching systems, such as dynamic time warping, in low-noise conditions. However, their performance tends to degrade in noisy environments. We propose a method of calculating features using the probabilistic latent component analysis (PLCA) framework. This framework models the speech and noise separately, leading to higher performance in noisy conditions than conventional methods. In this work, we compare our PLCA-based features with conventional features on the task of aligning a high-fidelity speech recording to a noisy speech recording, a scenario common in automatic dialogue replacement.
KW - Automatic Dialogue Replacement
KW - Dynamic Time Warping
KW - Probabilistic Latent Component Analysis
UR - http://www.scopus.com/inward/record.url?scp=84867586709&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84867586709&partnerID=8YFLogxK
U2 - 10.1109/ICASSP.2012.6288293
DO - 10.1109/ICASSP.2012.6288293
M3 - Conference contribution
AN - SCOPUS:84867586709
SN - 9781467300469
T3 - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
SP - 1973
EP - 1976
BT - 2012 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2012 - Proceedings
T2 - 2012 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2012
Y2 - 25 March 2012 through 30 March 2012
ER -