TY - GEN
T1 - Use of particle filtering and MCMC for inference in Probabilistic Acoustic Tube model
AU - Wang, Ruobai
AU - Zhang, Yang
AU - Ou, Zhijian
AU - Hasegawa-Johnson, Mark
N1 - Publisher Copyright: © 2016 IEEE.
PY - 2016/8/24
Y1 - 2016/8/24
AB - The Probabilistic Acoustic Tube (PAT) model is a probabilistic generative model of speech. By associating every generative parameter with a probability distribution, it becomes possible to convert every standard speech analysis task into a probabilistic inference task, thereby grounding every such task in quantifiable measures of bias and consistency. The previously published PAT model did not adequately model the AM-FM, and therefore the phase, of the voice source. In this paper, we model the AM-FM of the voice source using an autoregressive process. The resulting model is a non-linear state-space model and thus has no closed-form inference algorithm, but effective inference can be achieved by using Auxiliary Particle Filtering (APF) and Taylor-expansion-assisted Markov Chain Monte Carlo (MCMC). Results demonstrate that, unlike previous speech models, this model is able to account for the phase of the voice source, achieving signal reconstruction with 8.79 dB SNR.
KW - MCMC
KW - Speech modeling
KW - particle filter
UR - http://www.scopus.com/inward/record.url?scp=84987858874&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84987858874&partnerID=8YFLogxK
U2 - 10.1109/SSP.2016.7551748
DO - 10.1109/SSP.2016.7551748
M3 - Conference contribution
AN - SCOPUS:84987858874
T3 - IEEE Workshop on Statistical Signal Processing Proceedings
BT - 2016 19th IEEE Statistical Signal Processing Workshop, SSP 2016
PB - IEEE Computer Society
T2 - 19th IEEE Statistical Signal Processing Workshop, SSP 2016
Y2 - 25 June 2016 through 29 June 2016
ER -