A Bayesian predictive method for automatic speech segmentation

Ming Liu, Thomas S. Huang

Research output: Contribution to journal › Conference article

Abstract

Implicit speech segmentation amounts to finding the time instants at which the spectral distortion is large. The Spectral Variation Function (SVF) is a widely used measure of spectral distortion. However, SVF is a data-dependent measure. To make the measurement data-independent, a likelihood ratio is constructed to measure the spectral distortion. This ratio can be computed efficiently with a Bayesian predictive model. The prior of the Bayesian predictive model is estimated from unlabeled data via an unsupervised machine learning technique, the Gaussian Mixture Model (GMM). The experimental results show the effectiveness of this novel method. The performance on the TIMIT corpus indicates potential applications in speech recognition, synthesis and coding.
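As a rough illustration of scoring candidate boundaries with a likelihood ratio, the sketch below compares "two separate models" against "one shared model" for the frames on either side of each time index. It is not the paper's Bayesian predictive model, whose details are not given in this abstract: it swaps in a plain generalized likelihood ratio with maximum-likelihood diagonal Gaussians. The function names, the `half_window` parameter and the variance floor are assumptions made for this example.

```python
import math


def ml_gaussian_loglik(frames, var_floor=1e-6):
    """Log-likelihood of frames under a diagonal Gaussian fitted to them by ML."""
    n = len(frames)
    dim = len(frames[0])
    total = 0.0
    for d in range(dim):
        values = [frame[d] for frame in frames]
        mean = sum(values) / n
        # Floor the variance so constant features do not produce log(0).
        var = max(sum((v - mean) ** 2 for v in values) / n, var_floor)
        # At the ML estimate the quadratic term sums to n, hence the "+ 1".
        total += -0.5 * n * (math.log(2.0 * math.pi * var) + 1.0)
    return total


def glr_score(left, right):
    """Log likelihood ratio: separate models for each side vs. one shared model."""
    return (ml_gaussian_loglik(left) + ml_gaussian_loglik(right)
            - ml_gaussian_loglik(left + right))


def boundary_scores(frames, half_window=5):
    """Return (t, score) pairs; a high score suggests a spectral change at frame t."""
    scores = []
    for t in range(half_window, len(frames) - half_window + 1):
        left = frames[t - half_window:t]
        right = frames[t:t + half_window]
        scores.append((t, glr_score(left, right)))
    return scores
```

Here `frames` is assumed to be a list of per-frame feature vectors (for example cepstral coefficients), and candidate boundaries would be taken at local peaks of the returned scores. Replacing the maximum-likelihood fits with predictive densities under a GMM-derived prior, as the abstract describes, would change `ml_gaussian_loglik` but not the windowing logic.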

Original language: English (US)
Article number: 1699837
Pages (from-to): 290-293
Number of pages: 4
Journal: Proceedings - International Conference on Pattern Recognition
Volume: 4
DOI: 10.1109/ICPR.2006.38
State: Published - Dec 1 2006
Event: 18th International Conference on Pattern Recognition, ICPR 2006 - Hong Kong, China
Duration: Aug 20 2006 - Aug 24 2006

Fingerprint

Speech recognition
Learning systems

ASJC Scopus subject areas

  • Computer Vision and Pattern Recognition

Cite this

A Bayesian predictive method for automatic speech segmentation. / Liu, Ming; Huang, Thomas S.

In: Proceedings - International Conference on Pattern Recognition, Vol. 4, 1699837, 01.12.2006, p. 290-293.


@article{39359d8f03ae496ba4b6a120ec3f0e79,
title = "A Bayesian predictive method for automatic speech segmentation",
abstract = "Implicit speech segmentation amounts to finding the time instants at which the spectral distortion is large. The Spectral Variation Function (SVF) is a widely used measure of spectral distortion. However, SVF is a data-dependent measure. To make the measurement data-independent, a likelihood ratio is constructed to measure the spectral distortion. This ratio can be computed efficiently with a Bayesian predictive model. The prior of the Bayesian predictive model is estimated from unlabeled data via an unsupervised machine learning technique, the Gaussian Mixture Model (GMM). The experimental results show the effectiveness of this novel method. The performance on the TIMIT corpus indicates potential applications in speech recognition, synthesis and coding.",
author = "Liu, Ming and Huang, {Thomas S.}",
year = "2006",
month = "12",
day = "1",
doi = "10.1109/ICPR.2006.38",
language = "English (US)",
volume = "4",
pages = "290--293",
journal = "Proceedings - International Conference on Pattern Recognition",
issn = "1051-4651",
publisher = "Institute of Electrical and Electronics Engineers Inc.",

}

TY  - JOUR
T1  - A Bayesian predictive method for automatic speech segmentation
AU  - Liu, Ming
AU  - Huang, Thomas S.
PY  - 2006/12/1
Y1  - 2006/12/1
AB  - Implicit speech segmentation amounts to finding the time instants at which the spectral distortion is large. The Spectral Variation Function (SVF) is a widely used measure of spectral distortion. However, SVF is a data-dependent measure. To make the measurement data-independent, a likelihood ratio is constructed to measure the spectral distortion. This ratio can be computed efficiently with a Bayesian predictive model. The prior of the Bayesian predictive model is estimated from unlabeled data via an unsupervised machine learning technique, the Gaussian Mixture Model (GMM). The experimental results show the effectiveness of this novel method. The performance on the TIMIT corpus indicates potential applications in speech recognition, synthesis and coding.
UR  - http://www.scopus.com/inward/record.url?scp=34147133241&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=34147133241&partnerID=8YFLogxK
U2  - 10.1109/ICPR.2006.38
DO  - 10.1109/ICPR.2006.38
M3  - Conference article
AN  - SCOPUS:34147133241
VL  - 4
SP  - 290
EP  - 293
JO  - Proceedings - International Conference on Pattern Recognition
JF  - Proceedings - International Conference on Pattern Recognition
SN  - 1051-4651
M1  - 1699837
ER  -